ABSTRACT


A literature survey of the various types of super-population models
and their uses in finite population sampling is given. Topics such as
optimum sampling, balanced samples and randomization in survey sampling
are discussed in detail. We prove some properties of the univariate
distribution of a quantile of a sample from a finite population. One of
our main objectives was to explore the possible use of auxiliary
information for estimating finite population quantiles. With this
goal in mind, we derive the bivariate distribution of sample
quantiles and its asymptotic form. Assuming a certain super-population
model, we suggest an estimator for finite population
quantiles which involves the auxiliary variable. Further study will be
required to establish useful properties of the suggested estimator.
CHAPTER I


INTRODUCTION 


In sample survey theory the basic assumption is that we have
a fixed finite population of N identifiable units under study. Owing
to lack of time and resources, which include money, expertise, etc., we
are constrained to study only a part of the population concerned. Of
course, there are situations where a complete study of the population
is not at all feasible and one has to depend on sample survey methods.
The objective in survey sampling is to make inferences about some
characteristics of the population. Most of the literature on sample
surveys deals with the estimation of the population total, population
mean, population proportion and the standard errors of their estimates.

The object of this thesis is to study recent developments in 
survey sampling dealing with the estimation of quantiles of finite 
population and inference under super population models. It is a common 
practice to use auxiliary information for estimating population mean 
or total. Analogously, our interest in this work is to study the use 
of auxiliary information for improving the estimation of finite 
population quantiles. For example, can we use with benefit the inform- 
ation on quantiles of auxiliary variables in estimating the quantile 
of the main variables? Keeping this objective in mind we have derived 
the bivariate distribution of sample quantiles and studied its 
asymptotic behaviour. 

In this chapter we shall point out some shortcomings
of the conventional approach and give a brief history of the development of
the model based approach to inference in survey sampling as an
alternative to the conventional one.

Let there be $N < \infty$ units in the population. N is called
the population size. The units are identifiable, that is, units of the
population can be uniquely labelled from 1 to N and the label of
each unit is known. We can denote the population by $U = \{1, \ldots, N\}$.
With each unit i there is associated a measurement $y_i$ on a variable
character y. For all practical purposes $y_i$ is real for all $i = 1, \ldots, N$.
We shall represent the auxiliary information on unit i by a real
measurement $x_i$, or by a vector $\mathbf{x}_i$. Our target is to estimate the population
total, $y = \sum_{i=1}^{N} y_i$. The method is to draw a representative sample of the
population and, on the basis of the sample observations, to estimate the
population total. Let $s = \{i_1, \ldots, i_n\}$ be the sampled units, drawn without
repetition, and $\bar{s} = U - s$ be the units not in the sample s.


Then we have the population total

(1.1)   $y = \sum_{i \in s} y_i + \sum_{i \in \bar{s}} y_i$.
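As a small numerical illustration of the decomposition in (1.1), the following Python sketch splits a population total into its sampled and non-sampled parts. The population values, the labels and the function name `split_total` are hypothetical and chosen only for illustration.

```python
def split_total(y, s):
    """Split the population total of y (a dict label -> value) into the
    part observed in the sample s and the part that must be predicted."""
    sampled = sum(v for k, v in y.items() if k in s)
    unsampled = sum(v for k, v in y.items() if k not in s)
    return sampled, unsampled


# Hypothetical population of N = 5 units with labels 1..5.
y = {1: 12.0, 2: 7.5, 3: 9.0, 4: 15.5, 5: 6.0}
s = {1, 4}                        # labels of the sampled units
known, unknown = split_total(y, s)
print(known, unknown, known + unknown)
```

Only the second component is uncertain once the sample has been observed, which is why the estimation problem reduces to predicting the total of the non-sampled units.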


In estimating y, the first sum on the right hand side of (1.1) is
exactly known to us (assuming there is no measurement error) and our
object is to estimate the second sum, namely the total of the non-sampled
part of the population. The survey methodology available in common survey
sampling textbooks (e.g. Cochran, 1977) is devoted primarily to finding
a good survey design, suited to the practical situation, for estimating
the unknown part of the population total. This approach is now commonly
known as the conventional approach or the design based approach. At this
point it is interesting to mention a few lines from Basu (1969):
"The objective of planning a survey should be to end up with
a good sample. The term 'representative sample' has been used in survey
terminology. But no one has cared to give a precise definition of the
term. It is implicitly taken for granted that statistician with his
biased mind is unable to select a representative sample. So a
simplistic solution is sought by turning to an unbiased die (the
random number tables). Thus, a deaf and dumb die is supposed to do
the job of selecting a 'representative sample' better than a trained
statistician."


Broadly speaking, there are three main methods of estimation
of finite population parameters. These are methods based on
1. measurements of units which are exact, that is, there is no
error in measurements;
2. measurements which are not exact but subject to random
errors; and
3. knowledge of some process which generates the measurements on
a given unit.


Traditionally, randomization has been regarded as an essential part of 
survey sampling for objective inferences and estimability of the 
standard errors of estimates. In Case 1 above, randomization is created 
by the sampler through specific survey designs. This was the approach 
adopted by statisticians in developing the subject of sample survey. 

In this design based approach it is assumed that the population values
$y_1, \ldots, y_N$ are fixed and hence $y = (y_1, \ldots, y_N)$ can be treated as a
parameter of the population under consideration. Our interest is in
some function of this parameter, say g(y). There are certain authors,
for example Neyman (1971), who tend to focus attention only on
estimation based on man-made randomization in the form of design.

In Case 2, there are two sources of randomization: (i) created
randomization based on the survey design, and (ii) random error associated
with the measurements of the sample units, which is commonly known as
non-sampling error or bias. This latter aspect is beyond the scope of this
thesis. Modern development of survey sampling techniques mainly follows
the line of Case 3. This is widely known as the super-population approach
or model based approach. Under this approach it is assumed that to each
population unit is associated a random variable for which a stochastic
structure is specified. The actual value associated with the population
unit is treated as an outcome of that random variable. We shall discuss
various super-population models and methods of estimation under these models
in Chapter 2. The super-population approach is an elegant development
through which important new methods are currently being
added to the traditional methodology of survey sampling. Authors like Barnard
(1971), Kalbfleisch and Sprott (1969), Royall (1970, 1971), etc. consider
inference based on super-population models not only desirable but almost
necessary. Some of the authors have strongly criticized the idea that the
sample design, injected by the survey statistician himself, should be the
only source of randomness in the data. "The survey statistician does
not lean on probability-theory for the purpose of understanding and
controlling the mess created by an unavoidable source of randomness or
uncertainty (observation error)", Basu (1969). Basu examined the
randomization principle in survey sampling and came to the conclusion that
there is very little, if any, use for the survey designs. Chapter 3
deals with the randomization principle and its alternatives.
Although we have called the common and well-known approach
to survey sampling traditional or conventional, the idea of
super-populations is also not new. Cochran (1939, 1946), Deming and
Stephan (1941), Madow and Madow (1944) and Mahalanobis (1944) are early
users of the super-population idea. Deming and Stephan (1941) were the first
to mention clearly the idea that the status of the population is variable
rather than fixed. They made the comment that the census is a sample
only and suggested that it is one of many populations that might
have resulted. They regarded the difference between a census and a sample
survey as a matter of degree and considered a population census whose state of
nature is changing with time. Cochran (1946) first clearly assumed
that the finite population we have at our disposal is actually a sample
from an infinite population. He considered the population in which
the variance among the elements in any group of contiguous elements
increases as the size of the group increases. This type of population
was also considered by Smith (1938), Jessen (1942), Mahalanobis (1944)
and Hansen and Hurwitz (1943). Various mathematical models have been
considered by these authors for representing the situation where the
variance within a group is directly proportional to the size of the
group measure $x_i$. Cochran (1946) considered that the elements $x_i$ are
drawn from different populations and assumed that the population
changes in some regular manner with the value i. Alternatively, he
suggested that $x_i$ belongs to the same population but is serially
correlated, and found it more reasonable to consider the finite
population as a sample from an infinite population.

The idea of super-population models, i.e. the idea of
considering the existing population as a sample from an infinite
population, started much earlier. Unfortunately, the theoretical aspect
of the model based approach did not attract much attention from
statisticians until the 1960's. It is well known that the use of
supplementary information in the estimation of finite population parameters
in general increases the accuracy of the estimator. So samplers felt
the need for a comparison of the relative accuracy of sample designs
using such information. This comparison becomes difficult if we cannot
assume any functional relationship in the data. One solution to the
problem is to regard the finite population as a random sample from an
infinite super-population having certain properties. The results
so obtained do not apply to any single finite population but to the
average of all finite populations that can be drawn from the infinite
population, .(Ray,. 2950). 

Early works on super-population models are based on some type
of linear regression model with heteroscedastic error variances.
Hacking (1965) has proposed the concept of a "chance set-up" as logically
superior to the postulate of a hypothetical infinity of populations. For
example, the linear regression super-population model may be viewed as
defining a random set-up rather than an infinity of hypothetical
populations, if so desired. But the analyses are mathematically
identical. Foreman and Brewer (1971) have given comparisons of the
efficiencies of six methods of sampling in common use. The model they
used (also a commonly used super-population model) is an infinite set of
theoretical populations, each of size N. Units are identifiable,

having two measures $Y_i$ and $X_i$ on the $i$th unit, where $Y_i$ is the
measurement of the character of interest and $X_i$ is the measurement of the
auxiliary information (e.g. size of the unit), related to $Y_i$
as follows:

(1.2)   $Y_i = \alpha + \beta X_i + e_i$,   $i = 1, \ldots, N$,

where $\alpha$ and $\beta$ are constants and the $e_i$'s are random variables with
$E(e_i) = 0$, $E(e_i^2) = \sigma_i^2$ (sometimes $E(e_i^2)$ is some function of $X_i$)
and $E(e_i e_j) = 0$ for all $i \neq j$. Here the expectation, E, is over all
hypothetical populations, and $\sigma_i^2$ is constant over all these populations
but varies with i.
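To make the idea of "averaging over hypothetical populations" concrete, the following Python sketch draws many finite populations from a linear regression super-population model of the form (1.2). The model above specifies only the first two moments of the errors; the use of independent normal errors, the parameter values and the names `draw_population` and `sd` are assumptions made purely for illustration.

```python
import random


def draw_population(x, alpha, beta, sd, rng):
    """One realized finite population under Y_i = alpha + beta*x_i + e_i.

    The model fixes only E(e_i) = 0, E(e_i^2) = sd(x_i)^2 and zero
    correlation; independent normal errors are an extra assumption here.
    """
    return [alpha + beta * xi + rng.gauss(0.0, sd(xi)) for xi in x]


rng = random.Random(1)
x = [1.0, 2.0, 4.0, 8.0]                  # hypothetical size measures
sd = lambda xi: 0.5 * xi ** 0.5           # error variance proportional to x_i
pops = [draw_population(x, 2.0, 3.0, sd, rng) for _ in range(20000)]

# Averaging over many hypothetical populations approximates alpha + beta*x_i.
avg = [sum(p[i] for p in pops) / len(pops) for i in range(len(x))]
print([round(a, 2) for a in avg])
```

Each element of `pops` plays the role of one finite population of size N = 4; results derived under the model refer to the average over such populations rather than to any single one of them.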

During the 1960's, statisticians devoted much attention
to the theoretical aspects of survey sampling. For a long time there
were big gaps between survey sampling theories and statistical inference
theories. In the traditional books on survey sampling, authors used
the statistical inference theories under the assumption of large
samples. The maximum likelihood method of estimation in statistical
inference was essentially (for a long time) a failure in survey sampling
situations. If the sample is drawn with probability proportional to
the size of the unit, then how valid are the traditional methods of the theory
of hypothesis testing or the theory of statistical inference? The
answer to this question is still unknown. However, in the late 60's
and early 70's it became possible to relate likelihood methods and
Bayesian methods to finite populations. Some examples are Royall (1968,
1976a), Hartley and Rao (1968, 1969), Kalbfleisch and Sprott (1970),
C.R. Rao (1971), Ericson (1969a, b), Solomon and Zacks (1970), Basu
(1969), Zacks (1969), Godambe (1966, 1968), Godambe and Thompson
(1971), Godambe and Joshi (1965), etc.

The most remarkable and striking development in survey sampling
theory during the 1960's is the development of design free inference.
There are some strong criticisms of the use of survey design for inference
on finite populations. Godambe (1966) noted that the application of the
likelihood principle in the sampling situation would mean that the sampling
design is irrelevant for data analysis. Basu (1969) examined the role
of sufficiency and the likelihood principle and concluded,
"Once the sample has been drawn, the inference should not depend in any
way on the sampling design. This poses the problem of designing a
survey which will yield a good (representative) sample." He also
examined the randomization principle (the man-made randomization
through survey design) and pointed out that it has very limited use, if any,
in survey design.

Carrying this idea further, statisticians in the 1970's
started suggesting the use of subjective sampling for an optimum
estimator. Royall (1970) suggested a subjective sample, called a
"balanced sample", for estimating the population total. Under the
assumption of a linear-regression super-population model, his estimator
of the population total based on this balanced sample proved to be
most efficient. Brewer (1963) first suggested this type of purposive
sampling. Later Royall (1973a, b) studied the robustness of the
estimator based on balanced samples. This idea was further developed
and extended by many other authors, namely Holt (1975), Sigha (1976),
Mukhopadhyay (1977), Tallis (1978), Scott, Brewer and Ho (1978), and Singh
and Garg (1979). There is considerable criticism of this type
of purposive sampling, although the mathematical basis of the approach
is sound. However, it seems that as yet there is no conclusive
decision between the design based approach and the model based approach.
Both approaches have some merits and demerits. Some authors are trying
to mix these two streams. For example, Kolehmainen (1981) suggested
that stratification of the finite population should always be made,
if possible, and that sampling within strata can be made purposively.
Basu (1978) also suggested some type of post-stratification of data.
In Chapter 3, we discuss this matter in greater detail.

In Chapter 4, we discuss the use of order statistics in the 
estimation of quantiles of finite populations. There we have given 
some results on properties of the distribution of order statistics 
in finite population sampling, bivariate distribution of sample 
quantiles and estimation of quantiles using auxiliary information. 

In Chapter 5, we discuss the asymptotic behavior of some estimators
of finite population parameters and derive the asymptotic joint
distribution of sample quantiles.
CHAPTER II


SUPER-POPULATION MODELS AND PREDICTION 


§2.1 INTRODUCTION 


In this chapter we shall study different types of
super-population models and sampling theories based on these models. The ideas
and write-up of this chapter mostly follow Cassel, Särndal and
Wretman (1977).

In Chapter 1, we mentioned that the super-population model
arises when we consider the measurements $y = (y_1, \ldots, y_N)$ of a finite
population to be the outcome of a random variable $Y = (Y_1, \ldots, Y_N)$.
Let us denote the joint distribution of Y by $\xi$. Before proceeding
to the next section let us introduce some useful definitions.


Ordered sample: A sequence $s^* = (k_1, \ldots, k_{n(s^*)})$ such that $k_i \in U$
for $i = 1, \ldots, n(s^*)$ is called an ordered sample. The number of
components of $s^*$, denoted by $n(s^*)$, is called the sample size.


Unordered sample: A non-empty set s such that $s \subseteq U$ is called
an unordered sample. The number of elements of s, denoted by $\nu(s)$,
is called the effective sample size.


If the context involves only unordered samples, then we shall call
an unordered sample and its effective sample size simply the sample and the
sample size respectively. The set of all sets s will be denoted by $\mathcal{S}$.


Unordered sample design (or simply sample design): A function p(s)
on $\mathcal{S}$ satisfying $p(s) \geq 0$ for all $s \in \mathcal{S}$ and $\sum_{s \in \mathcal{S}} p(s) = 1$ will
be called an unordered sample design. Some authors refer to the pair
$(\mathcal{S}, p(\cdot))$ as the design.


The definition of the ordered sample design is similar. 


Non-informative design: A sample design p(.) is called a
non-informative design if and only if p(.) is a function that does not
depend on the y-values associated with the labels in s or s*. But p(.)
may be a function of auxiliary variables.


Fixed size design: If n(s*) or $\nu(s)$ is fixed then the respective
design is called a fixed size design.
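The definitions above can be checked on a small example. The sketch below enumerates the unordered sample design of simple random sampling without replacement, verifies that p(s) sums to one, and computes the inclusion probability of each label. The function names `srs_design` and `inclusion_probabilities` and the values N = 5, n = 2 are hypothetical; simple random sampling is used only as a convenient instance of a non-informative fixed size design.

```python
from fractions import Fraction
from itertools import combinations


def srs_design(N, n):
    """Simple random sampling without replacement on labels 1..N:
    p(s) = 1 / C(N, n) for every unordered sample s of size n."""
    samples = [frozenset(c) for c in combinations(range(1, N + 1), n)]
    p = Fraction(1, len(samples))
    return {s: p for s in samples}


def inclusion_probabilities(design, N):
    """Probability that label k belongs to the selected sample."""
    return {k: sum(p for s, p in design.items() if k in s)
            for k in range(1, N + 1)}


design = srs_design(5, 2)
print(len(design), sum(design.values()))
print(inclusion_probabilities(design, 5))
```

Because p(s) here depends only on the labels, and not on any y-value, the design is non-informative, and every sample has the same effective size n.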


§2.2 DIFFERENT SUPER-POPULATION MODELS 


By super-population model, or simply "model", we shall refer
to a class of distributions $\xi$ with various types of specifications.
These specifications may concern only the first few moments of $\xi$ or,
to be more specific, we may assume that $\xi$ has some specific well-defined
statistical distribution. However, in both cases it is assumed that
the vector of finite population values $y = (y_1, \ldots, y_N)$ is an outcome
of the random variable $Y = (Y_1, \ldots, Y_N)$ having distribution $\xi$.


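Definition: If $Q = Q(Y_1, \ldots, Y_N)$ is a function of $Y_1, \ldots, Y_N$, the
$\xi$-expectation of Q, denoted by $\mathscr{E}(Q)$, is defined as

(2.1)   $\mathscr{E}(Q) = \int Q \, d\xi$,

and the $\xi$-variance of Q, denoted by $\mathscr{V}(Q)$, is defined as

(2.2)   $\mathscr{V}(Q) = \int (Q - \mathscr{E}(Q))^2 \, d\xi$.

If $Q_1 = Q_1(Y_1, \ldots, Y_N)$ and $Q_2 = Q_2(Y_1, \ldots, Y_N)$ are two functions of
$Y_1, \ldots, Y_N$, the $\xi$-covariance of $Q_1$ and $Q_2$, denoted by $\mathscr{C}(Q_1, Q_2)$,
is defined as

(2.3)   $\mathscr{C}(Q_1, Q_2) = \int (Q_1 - \mathscr{E}(Q_1))(Q_2 - \mathscr{E}(Q_2)) \, d\xi$.

In particular, we shall define for $k, l = 1, \ldots, N$:

(2.4)   $\mu_k = \mathscr{E}(Y_k)$,   $\sigma_k^2 = \mathscr{V}(Y_k)$,   $\sigma_{kl} = \mathscr{C}(Y_k, Y_l)$ for $k \neq l$,

        $\bar{\mu} = \frac{1}{N} \sum_{k=1}^{N} \mu_k$   and   $\bar{Y} = \frac{1}{N} \sum_{k=1}^{N} Y_k$.


There are two broad classifications of models used in survey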

There are two broad classifications of models used in survey 
sampling. They are (i) general models, denoted by G and 
(ii) exchangeability models, denoted by E. Often there will be 


subscripts to further specify both Model G and Model E. 
Model $G_T$ (transformation model). This model specifies the class of
distributions $\xi$ such that, for given $a_k > 0$ and $b_k$, the variables

$Z_k = (Y_k - b_k)/a_k$,   $k = 1, \ldots, N$,

have common mean $\mu$, variance $\sigma^2$ and covariance $\rho\sigma^2$ for any pair
$k \neq l$. Unless otherwise stated, in general $\mu$, $\sigma^2$ and $\rho$ are
unknown, $-\frac{1}{N-1} \leq \rho \leq 1$ and $\sum_{k=1}^{N} a_k = N$. The condition on $\rho$ is
required to have non-negative $\mathscr{V}(\bar{Y})$. Therefore, under Model $G_T$,
$Y_k$ has the following moments:

(2.5)   $\mathscr{E}(Y_k) = a_k \mu + b_k$,
        $\mathscr{V}(Y_k) = a_k^2 \sigma^2$,
        $\mathscr{C}(Y_k, Y_l) = a_k a_l \rho\sigma^2$,   $k \neq l$.


Model $G_T$ implies that the first two moments of the transformed
variables $Z_1, \ldots, Z_N$ are unchanged. So we can suitably choose $a_k$
and $b_k$ for specifying a good sample design for the problem in hand.

Model $G_{T0}$. The special case of Model $G_T$ where $a_k = 1$, $b_k = 0$ for
all $k = 1, \ldots, N$ is Model $G_{T0}$. This model expresses that labels
are uninformative.
where ile agate and. are unknown and Wie eames uy is a set 


Of known numbers forall |k, 9k >=) 1,...,N. 


Model G. (ratio model). The class of distributions $\xi$ such that
$Y_1, \ldots, Y_N$ are independently distributed, and

$\mu_k = \mathscr{E}(Y_k) = \beta x_k$,   $\sigma_k^2 = \mathscr{V}(Y_k) = \sigma^2 v(x_k)$,   $k = 1, \ldots, N$,

where $\beta$ and $\sigma^2$ are unknown, $v(\cdot)$ is a known function and $x_1, \ldots, x_N$
are known positive constants. A common assumption is $v(x_k) = x_k^g$, where
g is known.

Next, let us consider various types of exchangeability models.
In order to be exchangeable, the distribution $\xi$ must be symmetric in
accordance with the following definition.


Definition. Random variables $Y_1, \ldots, Y_N$ are called exchangeable if
$Y_{r_1}, \ldots, Y_{r_N}$ have, for every permutation $r_1, \ldots, r_N$ of $1, \ldots, N$, the
same joint distribution, which is called an exchangeable distribution.


The idea of an exchangeable distribution in the context of a finite
population was given by Ericson (1965). The variables $Y_1, \ldots, Y_N$ themselves
may be assumed to be exchangeable. However, it is usually assumed that
the transformed $Y_k$, under change of origin and scale, are exchangeable.


Model $E_T$. This model defines the class of distributions $\xi$ such that,
for known $a_k > 0$ and $b_k$, $k = 1, \ldots, N$, satisfying $\sum_{k=1}^{N} a_k = N$, the
random variables

$Z_k = (Y_k - b_k)/a_k$,   $k = 1, \ldots, N$,

have an exchangeable absolutely continuous distribution. The common mean,
variance and covariance implied by the exchangeability will be denoted
by $\mu$, $\sigma^2$ and $\rho\sigma^2$ respectively. The first and second order moments
of $Y_k$ are given by (2.5). The $Y_k$'s themselves become exchangeable in the
following special case of Model $E_T$.

Model $E_{T0}$. The special case of Model $E_T$ such that $a_k = 1$, $b_k = 0$
for all $k = 1, \ldots, N$.
Let us now consider the discrete exchangeable super-population
model. This is mostly known as the random permutation or random labeling
model. This model was first used in Madow and Madow (1944) and not
addressed again until Kempthorne (1969). Recent works on random
permutation models are Royall (1970a), Ramakrishnan (1970), C.R. Rao (1971),
Godambe and Thompson (1973), Rao (1975) and Rao and Bellhouse (1978).
Under the random permutation model we assume that the N population values
of Y are fixed but are labeled at random, so that each permutation
$r = (r_1, \ldots, r_N)$ of $1, \ldots, N$ is assumed to have probability equal to
1/N! of being assigned as labels for the units. The equivalent statement
is that the fixed but unknown numbers $y_1, \ldots, y_N$ are assigned randomly
to units with labels $1, \ldots, N$, so that each permutation of $y_1, \ldots, y_N$
has a probability equal to 1/N! for fixed labels $1, \ldots, N$. Under both
versions it is implied that there is no systematic relationship between
labels and the corresponding y-values. The y-value corresponding to
the $k$th label can be regarded as the outcome of a random variable
$Y_k$. Let us now consider the following general model.


Model $E_{RP}$ (random permutation model). The class of distributions $\xi$
such that, for any fixed, unknown numbers $z_1, \ldots, z_N$ and for given
numbers $a_k > 0$ and $b_k$, $k = 1, \ldots, N$, such that $\sum_{k=1}^{N} a_k = N$,

exchangeable), their joint density function being,

where $g(\cdot \mid \theta)$ is the density function for all $Y_k$.
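The random labeling mechanism described in this section can be examined exactly for a small population. The following Python sketch enumerates all N! assignments of a set of fixed values to the labels and computes the mean and variance of $Y_1$ and the covariance of $(Y_1, Y_2)$. The values and the function name `labeling_moments` are hypothetical; no transformation is applied, so the sketch corresponds to the untransformed case only.

```python
from fractions import Fraction
from itertools import permutations


def labeling_moments(values):
    """Exact mean and variance of Y_1 and covariance of (Y_1, Y_2) when each
    of the N! assignments of the fixed values to labels has probability 1/N!."""
    perms = list(permutations(values))
    m = len(perms)
    mean = sum(Fraction(p[0]) for p in perms) / m
    var = sum((p[0] - mean) ** 2 for p in perms) / m
    # By symmetry Y_2 has the same mean as Y_1.
    cov = sum((p[0] - mean) * (p[1] - mean) for p in perms) / m
    return mean, var, cov


mean, var, cov = labeling_moments([1, 2, 3, 6])
print(mean, var, cov, cov / var)
```

For these four values the ratio of covariance to variance is -1/(N-1), the smallest correlation that N exchangeable variables can have; this is the lower limit placed on the common correlation in the transformation model.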


§2.3 SOME DEFINITIONS AND TERMINOLOGIES 


In survey sampling, we mostly deal with prediction of 
their standard errors. For inference on $\bar{y}$ we need data. Under the

use $T(\mathcal{D})$ for inference on $\bar{Y}$, then we shall call $T(\mathcal{D})$ a
predictor or estimator of $\bar{Y}$. Hence we can replace $\mathcal{D}$ of the predictor
$T(\mathcal{D})$ by $\mathcal{D}_s$ to get a new predictor $T(\mathcal{D}_s)$ for $\bar{Y}$. This is still
a function of the random variables $Y_k$, $k \in s$. We shall often use simply
T for $T(\mathcal{D})$ or $T(\mathcal{D}_s)$, just to indicate that T is a function of the
random variables $Y_k$, where k may be in S or in s respectively.
On the other hand, the random variable obtained from $T(\mathcal{D})$ for
$Y_k = y_k$, $k = 1, \ldots, N$, fixed but $k \in S$ will be written as $t(D)$.
The value of $T(\mathcal{D})$ for $S = s$ and $Y_k = y_k$, $k \in s$, will be written
as t(d). Thus t(d) is no longer a random variable and will be termed
the estimate or predicted value of $\bar{y}$. As above, we shall simply
write t for $t(D)$ or t(d). The small letter t will indicate that
t is a function of the realized values $y_k$ of $Y_k$.
Definition. T is called a p-unbiased (design unbiased) predictor of
$\bar{Y}$ if and only if, for a given design p, $E(t) = \bar{y}$ for all
$y = (y_1, \ldots, y_N) \in R_N$, where t is the realized value of T for
$Y_k = y_k$, $k \in S$. The strategy (p, T) is called p-unbiased if T is
a p-unbiased predictor under p.
for any distribution $\xi$, $\mathscr{E}(T - \bar{Y}) = 0$ for all $s \in \mathcal{S}$, where $\mathscr{E}$ is
the expectation operator with respect to $\xi$.
Definition. If $T_1$ and $T_2$ are predictors such that, for the given
design p, $\mathscr{E}\mathrm{MSE}(p, T_1) \leq \mathscr{E}\mathrm{MSE}(p, T_2)$ for all $\xi \in \mathcal{C}$, a given class
of super-populations, then $T_1$ is called at least as good a predictor
as $T_2$ for the design p. If strict inequality holds for at least one
$\xi \in \mathcal{C}$, then $T_1$ will be called better than $T_2$.


Definition. If $(p_1, T_1)$ and $(p_2, T_2)$ are strategies such that
$\mathscr{E}\mathrm{MSE}(p_1, T_1) \leq \mathscr{E}\mathrm{MSE}(p_2, T_2)$ for all $\xi \in \mathcal{C}$, then we shall say that
$(p_1, T_1)$ is at least as good a strategy as $(p_2, T_2)$. If strict
inequality holds for at least one $\xi \in \mathcal{C}$, then we say that $(p_1, T_1)$
is better than $(p_2, T_2)$.
(2.9)   $\mathscr{E}V(T) = E\mathscr{V}(T) + E\{\mathscr{E}(T - \bar{Y})\}^2 - \mathscr{V}(\bar{Y})$.
(b) If T is p- as well as $\xi$-unbiased then

(2.10)   $\mathscr{E}V(T) = E\mathscr{V}(T) - \mathscr{V}(\bar{Y})$.
In the next two sections we discuss various
predictors under design oriented super-population models and
design-independent super-population models.


§2.4 PREDICTION UNDER DESIGN ORIENTED SUPER-POPULATION MODEL 


Since the publication of the paper by Horvitz and Thompson
(1952), the estimator $T_{HT}$, well known as the Horvitz-Thompson estimator,
has been considered in the traditional literature of survey sampling as the most
attractive estimator. Although it possesses some good optimality
properties, it has lost some of its attractiveness since the development of
super-population ideas. The Horvitz-Thompson estimator is
defined for any arbitrary design as,

where $\alpha_k$ is the inclusion probability of unit $k = 1, \ldots, N$. Basu
(1971) suggested a modified form of $T_{HT}$ which is known as the
Given an arbitrary vector $e = (e_1, \ldots, e_N)$ and a design with
inclusion probabilities $\alpha_k > 0$, $k = 1, \ldots, N$, the generalized difference
The estimator $T_{GD}$ has the following properties:
(i) $T_{GD}$ is p-unbiased;
(ii) $T_{GD}$ has zero p-variance for any value y of Y that
satisfies $(y_k - e_k) \propto \alpha_k$ $(k = 1, \ldots, N)$, provided that p is
a design with fixed effective sample size n,
abbreviated as FES(n);


(iii) $T_{GD}$ is $\xi$-unbiased for any model if $e_k = \mu_k$, $k = 1, \ldots, N$;
(v) $T_{GD}$ is origin and scale invariant.
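Properties (i) and (ii) can be verified exactly on a small population by enumerating every sample of a fixed size design. The sketch below assumes the usual textbook form of the generalized difference estimator of the population mean, namely the sum over the sample of $(y_k - e_k)/\alpha_k$ plus the population total of the $e_k$, all divided by N; this form, the unequal probability design built from arbitrary weights, and all function names are assumptions made for illustration and are not taken from the displays of this chapter. Setting every $e_k$ to zero gives the corresponding Horvitz-Thompson form.

```python
from fractions import Fraction
from itertools import combinations


def make_design(N, n, weight):
    """Fixed size design on labels 1..N with p(s) proportional to weight(s)."""
    samples = [frozenset(c) for c in combinations(range(1, N + 1), n)]
    total = sum(Fraction(weight(s)) for s in samples)
    return {s: Fraction(weight(s)) / total for s in samples}


def inclusion(design, N):
    """Inclusion probability of each label under the design."""
    return {k: sum(p for s, p in design.items() if k in s)
            for k in range(1, N + 1)}


def t_gd(s, y, e, alpha):
    """Assumed form: (sum_{k in s} (y_k - e_k)/alpha_k + sum_k e_k) / N."""
    N = len(y)
    return (sum((y[k] - e[k]) / alpha[k] for k in s) + sum(e.values())) / N


def p_moments(design, estimator):
    """Design expectation and design variance of an estimator."""
    mean = sum(p * estimator(s) for s, p in design.items())
    var = sum(p * (estimator(s) - mean) ** 2 for s, p in design.items())
    return mean, var


N, n = 4, 2
design = make_design(N, n, lambda s: sum(s))     # hypothetical unequal weights
alpha = inclusion(design, N)
y = {1: 3, 2: 5, 3: 4, 4: 8}                     # hypothetical y-values
e = {1: 2, 2: 2, 3: 5, 4: 7}                     # arbitrary known vector

mean, var = p_moments(design, lambda s: t_gd(s, y, e, alpha))
print(mean, var)                                 # mean equals the population mean

# y-values with (y_k - e_k) proportional to alpha_k give zero design variance.
y_prop = {k: e[k] + 6 * alpha[k] for k in y}
mean0, var0 = p_moments(design, lambda s: t_gd(s, y_prop, e, alpha))
print(mean0, var0)
```

Exact rational arithmetic is used so that unbiasedness and the zero variance can be checked as equalities rather than up to rounding error.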
The predictor $T_{GD0}$ has the following properties:
(i) $T_{GD0}$ is $\xi$-unbiased and $p\xi$-unbiased for any $\xi$ satisfying
Model $G_T$ and for any FES(n) design p.
(ii) $T_{GD0}$ is p-unbiased under $p = p_0$, but for an arbitrary
FES(n) design $T_{GD0}$ is not necessarily p-unbiased.


Let us now consider a predictor T belonging to
the class of all p-unbiased linear predictors of $\bar{Y}$. Hence $E(T) = \bar{Y}$
and T is of the form

(2.15)   $T = w_{0s} + \sum_{k \in s} w_{ks} Y_k$.


Theorem 2.1 (Cassel et al., 1976). Under Model $G_T$ and for
$A = \frac{1}{N} \sum_{k=1}^{N} a_k^2$ and $f = n/N$,

(2.16)   $\mathscr{E}V(T) \geq \mathscr{E}V(T_{GD0}) = \frac{(1-\rho)(1 - fA)\sigma^2}{n}$

for any strategy (p, T) such that p is an FES(n) design with
$\alpha_k > 0$, $k = 1, \ldots, N$, and T is a p-unbiased linear predictor of $\bar{Y}$;
equality holds if and only if $(p, T) = (p_0, T_{GD0})$.

Remark. A strategy based on the p-unbiased predictor $T_{HT}$ as in
(2.11) can never, under Model $G_T$, be better than the strategy
$(p_0, T_{GD0})$.


In the light of the above theorem, under Model $G_T$
it is in general advisable, for an optimal predictor, to use a design giving large
inclusion probabilities to units considered by the model to be highly
variable. However, if all $Y_k$ are assumed to have equal variances,
that is, $a_k = 1$ for all $k = 1, \ldots, N$, then $(p_0, T_{GD0})$ is the best
strategy, where $p_0 = p_0(s)$ is such that $\alpha_k = n/N = f$ for all $k = 1, \ldots, N$,
satisfied for example in the case of simple random sampling, and

(2.17)   $T_{GD0} = \bar{Y}_s + \bar{b} - \bar{b}_s$,

the well known difference estimator, where $\bar{b}_s = \frac{1}{n} \sum_{k \in s} b_k$, and

(2.18)   $\mathscr{E}V(T_{GD0}) = \frac{(1-\rho)(1-f)\sigma^2}{n}$.


Under stratified random sampling, we know that under
optimum allocation the number of units to be selected from a stratum
is proportional to the product of the stratum size and the stratum standard
deviation. That is, for strata of comparable size we select
larger numbers of units from a stratum having large variance. So
stratified random sampling is also a technique of unequal probability
sampling for units not in the same stratum. In the Horvitz-Thompson
strategy we also give larger selection probability to units for which
the variance is large; the problem of optimization is simply attacked
from a different angle.
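The optimum allocation referred to here is, in its usual textbook statement (e.g. Cochran, 1977), the Neyman allocation, in which the stratum sample size is proportional to the stratum size times the stratum standard deviation. The sketch below computes it for hypothetical strata; the function name `neyman_allocation` and the numbers are illustrative assumptions, and the allocations are left unrounded.

```python
def neyman_allocation(n, sizes, sds):
    """Allocate a total sample size n to strata in proportion to N_h * S_h
    (stratum size times stratum standard deviation). Values are unrounded."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: sizes N_h and standard deviations S_h.
sizes = [500, 300, 200]
sds = [2.0, 4.0, 10.0]
alloc = neyman_allocation(100, sizes, sds)
print([round(a, 2) for a in alloc])

# Implied sampling fractions n_h / N_h are proportional to S_h.
print([round(a / N_h, 4) for a, N_h in zip(alloc, sizes)])
```

The second line of output shows the point made in the text: units in the more variable strata receive larger inclusion probabilities, even though selection within each stratum is by simple random sampling.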


We have mentioned in Theorem 2.1 that the strategy $(p_0, T_{GD0})$
is optimal in the class of p-unbiased linear estimators. But if we
have an exchangeable model then, as stated in Theorem 2.2 below,
$(p_0, T_{GD0})$ is also optimum in the wider class of p-unbiased estimators.
We do not have to adhere to linear estimators only.


Lemma 2.2 (Cassel et al., 1977). Let p be any given FES(n)
design with $\alpha_k > 0$, $k = 1, \ldots, N$. Then, under Model $E_T$,
(2.19) €x(r-i)? > GEC =H)” = ens , 


for any linear or non-linear $p\xi$-unbiased estimator T of $\bar{Y}$; equality
holds if and only if $T = T_{GD0}$.


Theorem 2.2 (Cassel et al., 1977). Under Model $E_T$, letting
$A = \frac{1}{N} \sum_{k=1}^{N} a_k^2$ and $f = n/N$,

(2.20)   $\mathscr{E}V(T) \geq \mathscr{E}V(T_{GD0}) = \frac{(1-\rho)(1 - fA)\sigma^2}{n}$

for any strategy (p, T) such that p is an FES(n) design with $\alpha_k > 0$,
$k = 1, \ldots, N$, and T belongs to the class of all (linear or non-linear)
p-unbiased predictors of $\bar{Y}$; equality holds if and only if
$(p, T) = (p_0, T_{GD0})$, where $p_0$ and $T_{GD0}$ are given by (2.13) and
(2.14) respectively.


Theorem 2.2 is also true for the random permutation model $E_{RP}$. An extensive investigation of random permutation models has been done by Rao and Bellhouse (1978). Using generalized random permutation models and a general class of linear estimators of the finite population mean, Rao and Bellhouse have shown that many of the conventional estimators are optimal in the sense of minimum average mean square error. They investigated optimality under the following sample designs: unistage design, stratified design, post-stratified design, double sampling design, sampling on two occasions and two-stage sampling design.


§2.5 PREDICTION UNDER DESIGN FREE SUPER-POPULATION MODEL 


Most of the survey statisticians who believe in the model based approach to estimation in survey sampling argue that p-unbiasedness is an unnecessarily heavy restriction, and that instead $\xi$-unbiasedness or possibly $p\xi$-unbiasedness should be required. Their opinion is that the average of $(T-\bar{Y})^2$ with respect to the design p is a matter of presampling interest only. In this section, we shall deal with the design-free, model based approach to prediction. Here the distribution $\xi$ is the essential element of inference and s is treated as given, with less attention paid to the design p producing the sample s. So, our object is to choose T, for any given s, to minimize $\mathcal{E}(T-\bar{Y})^2$. The average with respect to p is of secondary importance. It turns out that the predictor T that minimizes $\mathcal{E}(T-\bar{Y})^2$ for any given s is also the predictor that minimizes $\mathcal{E}E(T-\bar{Y})^2$ for any given non-informative design p.

Here we shall assume that the super-population distribution $\xi = \xi_\theta$ depends on a certain parameter (or parameter vector) $\theta \in \Theta$, which is unknown. Once we can specify $\xi_\theta$, the method of prediction of $\bar{Y}$ becomes a classical inference problem.


For an arbitrary set $s \in \mathcal{S}$, let $\xi_s = \xi_{s,\theta}$ be the marginal distribution of $Y_{k_1}, \ldots, Y_{k_{\nu(s)}}$, where $k_1 < \cdots < k_{\nu(s)}$ is an enumeration in increasing order of the labels $k \in s$, and let $\xi_{\bar{s}/s} = \xi_{\bar{s}/s,\theta}$ be the joint conditional distribution of $Y_k$, $k \in \bar{s}$ (taken in increasing order of k), given $Y_{k_1}, \ldots, Y_{k_{\nu(s)}}$. Let the corresponding density functions be $g(y/\theta)$, $g_s(y_s/\theta)$ and $g_{\bar{s}/s}(y_{\bar{s}}/\theta)$.

Note that if $\xi$ is an exchangeable distribution, then so are $\xi_s$ and $\xi_{\bar{s}/s}$. Let $\mathcal{E}$, $\mathcal{E}_s$ and $\mathcal{E}_{\bar{s}/s}$ be the expectation operators associated with $\xi$, $\xi_s$ and $\xi_{\bar{s}/s}$ respectively. Now, if p is non-informative, which we shall in general assume, then the operators $\mathcal{E}$ and $E$ may be interchanged, that is

(2.21) $\mathcal{E}MSE(p,T) = \mathcal{E}E(T-\bar{Y})^2 = E\,\mathcal{E}(T-\bar{Y})^2 = E\,\mathcal{E}_s\mathcal{E}_{\bar{s}/s}(T-\bar{Y})^2$.


Here our objective is to minimize $\mathcal{E}(T-\bar{Y})^2$ for any sample s. So if we can find a $T^*$ which minimizes $\mathcal{E}(T-\bar{Y})^2$ for any $s \in \mathcal{S}$, and p is any non-informative design, then $T^*$ also has the property of minimizing $\mathcal{E}MSE(p,T)$ for any given design p. Alternatively, if $T^*$ is such that it minimizes $\mathcal{E}(T-\bar{Y})^2$, then in the presampling stage we can look for the best design p which uses $T^*$ and minimizes $\mathcal{E}MSE(p,T^*)$ over different p.


Let the population mean

(2.22) $\bar{Y} = f_s\bar{Y}_s + (1-f_s)\bar{Y}_{\bar{s}}$,

where $f_s = \nu(s)/N$, $\bar{Y}_s = \sum_s Y_k/\nu(s)$ and $\bar{Y}_{\bar{s}} = \sum_{\bar{s}} Y_k/(N-\nu(s))$, have realized the value $\bar{y}$ of $\bar{Y}$; this can be expressed as

(2.23) $\bar{y} = f_s\bar{y}_s + (1-f_s)\bar{y}_{\bar{s}}$.

In this representation of the population mean $\bar{y}$, the first part of the right hand side is known due to sampling. So, Basu (1971) suggested that attempts should be made at a post survey estimation of the unknown part $\bar{y}_{\bar{s}}$. But this idea is criticized by decision theorists, as here the estimator is selected after observing the data.

Let U be a predictor of $\bar{Y}_{\bar{s}}$; then it follows from (2.22) that, for any given sample s,

(2.24) $T = f_s\bar{Y}_s + (1-f_s)U$

is a predictor of $\bar{Y}$. Since s is given, the distribution of $U = U(D_s)$ and $T(D_s) = f_s\bar{Y}_s + (1-f_s)U(D_s)$ depends entirely on $\xi$. In terms of U, the $\mathcal{E}MSE$ can be written as

(2.25) $\mathcal{E}MSE(p,T) = E\{(1-f_s)^2\,\mathcal{E}_s\mathcal{E}_{\bar{s}/s}(U-\bar{Y}_{\bar{s}})^2\}$.

If $\xi$ is completely known, minimum $\mathcal{E}MSE$ is obtained if, for any given s, we choose

(2.26) $U = \mathcal{E}_{\bar{s}/s}(\bar{Y}_{\bar{s}})$.

However, if $\xi$ depends on the unknown parameter vector $\theta$, then we shall first have to estimate $\theta$ and then attempt to predict $\bar{Y}_{\bar{s}}$. Note that T is $\xi$-unbiased for $\bar{Y}$ if and only if, for every $s \in \mathcal{S}$, U is $\xi$-unbiased for $\bar{Y}_{\bar{s}}$.


Let us now consider some $\xi$-unbiased predictors. If we relax the condition of p-unbiasedness of the last section and impose the looser restriction of $\xi$-unbiasedness, then we can find predictors with smaller mean square errors. This is demonstrated by the following theorem of Cassel et al. (1977).


Theorem 2.3. Let p be any given design. Then, under Model $G_T$,

(2.27) $\mathcal{E}E(T-\bar{Y})^2 \ge \mathcal{E}E(T^*-\bar{Y})^2$,

where T is any linear $\xi$-unbiased predictor of $\bar{Y}$ and, for any $s \in \mathcal{S}$,

(2.28) $T^* = f_s\bar{Y}_s + (1-f_s)(\bar{Z}_s\bar{a}_{\bar{s}} + \bar{b}_{\bar{s}})$,

where

(2.29) $\bar{Z}_s = \dfrac{1}{\nu(s)}\sum_s \dfrac{Y_k - b_k}{a_k}$. □

Next, let us consider the predictor $T^*$ under various special cases of Model $G_T$.


(i) If under Model $G_T$, $b_k = 0$ for all $k = 1, \ldots, N$, then

(2.30) $T^* = f_s\bar{Y}_s + (1-f_s)\bar{a}_{\bar{s}}\bar{Z}_s$,

where $\bar{Z}_s$ is as in (2.29) and $Z_k = Y_k/a_k$. If p is an FES(n) design, then

(2.31) $T^* = T_{HT0} + \dfrac{1}{N}\sum_s (a_k - \bar{a}_s)Z_k$,

where $T_{HT0} = \bar{a}\sum_s Y_k/(n a_k)$, $\bar{a}_s = \sum_s a_k/n$ and $\bar{a} = \sum_1^N a_k/N$.

(ii) If under Model $G_T$, $a_k = 1$ for all $k = 1, \ldots, N$, then

(2.32) $T^* = \bar{Y}_s + \bar{b} - \bar{b}_s$,

the usual difference predictor.

(iii) Finally, under Model $G_{T0}$,

$T^* = \bar{Y}_s$,

the sample mean.
For the sake of comparison let us discuss the intuitively appealing predictor,

(2.33) $T^\circ = \bar{a}\bar{Z}_s + \bar{b}$.

This predictor has the following properties.

(i) $T^\circ$ is a $\xi$-unbiased predictor of $\bar{Y}$ under Model $G_T$.

(ii) $T^\circ$ minimizes, under Model $G_T$, for any fixed s, the criterion $\mathcal{E}(T-\bar{\mu})^2$ among linear $\xi$-unbiased estimators of the super-population parameter $\bar{\mu} = \mathcal{E}(\bar{Y})$; hence $T^\circ$ would be preferred if inference were directed to the super-population and not to the realization $\bar{y}$.

(iii) If p is an FES(n) design, $T^\circ = T_{HT0}$ given by (2.14).

(iv) If $a_k = 1$, $b_k = 0$ and p is an FES(n) design, then $T^\circ = T^*$.
From the above discussion, it emerges that $T^*$ and $T^\circ$ are both optimal, but by different criteria. If the criterion is min $\mathcal{E}MSE$, then $T^*$ is optimal and better than $T^\circ$ to the extent shown in the following theorem (Cassel et al., 1977).


Theorem 2.4. Under Model $G_T$, for any design p,

(2.34) $\mathcal{E}MSE(p,T^\circ) - \mathcal{E}MSE(p,T^*) = \dfrac{E\{\sum_s (a_k-\bar{a}_s)^2\}(1-\rho)\sigma^2}{N^2} \ge 0$,

where $T^*$ and $T^\circ$ are given in (2.28) and (2.33). Moreover,

(2.35) $\mathcal{E}MSE(p,T^\circ) = \left[E\left\{\dfrac{1}{\nu(s)}\right\}\bar{a}^2 - \dfrac{\bar{a}^2}{N} + \dfrac{2\bar{a}E(\bar{a}-\bar{a}_s)}{N} + \dfrac{\sum_1^N (a_k-\bar{a})^2}{N^2}\right](1-\rho)\sigma^2$,

where $\bar{a}_s = \frac{1}{\nu(s)}\sum_s a_k$ and $\bar{a} = \frac{1}{N}\sum_1^N a_k$. Strict inequality holds in (2.34) if $p(s) > 0$ for some s such that not all $a_k$ for $k \in s$ are equal. □
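The comparison in Theorem 2.4 can be checked numerically for a single fixed sample. The Python sketch below is an illustration only, not part of the cited result: it assumes Model $G_T$ in the form where $Z_k = (Y_k - b_k)/a_k$ have common variance $\sigma^2$ and common correlation $\rho$, writes $T^\circ - \bar{Y}$ and $T^* - \bar{Y}$ as linear combinations of the $Z_k$, and compares the difference of their variances with $\sum_s (a_k-\bar{a}_s)^2(1-\rho)\sigma^2/N^2$. The values of $a_k$, $\sigma^2$, $\rho$ and the sample are hypothetical.

```python
def quad_var(c, sigma2, rho):
    """Variance of sum(c_k * Z_k) when V(Z_k) = sigma2 and every pair has correlation rho."""
    s1 = sum(c)
    s2 = sum(v * v for v in c)
    return sigma2 * ((1 - rho) * s2 + rho * s1 * s1)


def emse_given_sample(a, sample, sigma2, rho):
    """Return the model mean square errors of T-circle and T-star for a fixed sample."""
    N, n = len(a), len(sample)
    in_s = set(sample)
    a_bar = sum(a) / N
    a_rest = sum(a[k] for k in range(N) if k not in in_s)
    # Coefficients of Z_k in T* - Ybar (the b_k terms cancel).
    c_star = [a_rest / (n * N) if k in in_s else -a[k] / N for k in range(N)]
    # Coefficients of Z_k in T-circle - Ybar.
    c_circ = [a_bar / n - a[k] / N if k in in_s else -a[k] / N for k in range(N)]
    return quad_var(c_circ, sigma2, rho), quad_var(c_star, sigma2, rho)


# Hypothetical population constants and sample (labels are 0-based).
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sample = [0, 2, 5]
sigma2, rho = 2.0, 0.3

mse_circ, mse_star = emse_given_sample(a, sample, sigma2, rho)
excess = mse_circ - mse_star

a_bar_s = sum(a[k] for k in sample) / len(sample)
rhs = sum((a[k] - a_bar_s) ** 2 for k in sample) * (1 - rho) * sigma2 / len(a) ** 2
```

Averaging both sides over the design p would give the unconditional statement; the sketch only looks at one sample.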


The comparison given in Theorem 2.4 holds for any p, but neither $T^*$ nor $T^\circ$ is necessarily p-unbiased. However, both are $\xi$-unbiased under Model $G_T$.

Models such as $G_R$ and $G_{MR}$ assume that $Y_1, \ldots, Y_N$ are independently distributed. Under the assumption of independence, the optimal $\xi$-unbiased predictor is given by the following theorem.

Theorem 2.5 (Cassel et al., 1977). Let p be any given design, and let $T = f_s\bar{Y}_s + (1-f_s)U$ and $T' = f_s\bar{Y}_s + (1-f_s)U'$ be two $\xi$-unbiased predictors of $\bar{Y}$. Then, if $\xi$ is a product measure (as under Models $G_R$ and $G_{MR}$, for example), the inequality

$\mathcal{E}MSE(p,T) \le \mathcal{E}MSE(p,T')$

holds if and only if, for any $s \in \mathcal{S}$ such that $p(s) > 0$,

$\mathcal{E}(U - \bar{\mu}_{\bar{s}})^2 \le \mathcal{E}(U' - \bar{\mu}_{\bar{s}})^2$,

where $\bar{\mu}_{\bar{s}} = \mathcal{E}(\bar{Y}_{\bar{s}})$. If for some s with $p(s) > 0$ the latter inequality is strict, then the former inequality is also strict. □


A similar type of result was derived by Fuller (1970). In view of these results, much of classical parametric estimation is relevant to finite population sampling. If we know something about the shape of the distribution, it is possible to construct predictors of $\bar{Y}$ which are more efficient than the sample mean $\bar{Y}_s$. For example, Fuller (1970) proposed simple predictors of $\bar{Y}$ when the tail of the distribution is well approximated by the tail of a Weibull distribution, and Ringer, Jenkins and Hartley (1972) proposed a square root predictor for a positively skewed population.

Much of the literature on super-populations contains discussion of Models $G_R$ and $G_{MR}$. We have already mentioned in Chapter 1 that the idea of a super-population first came from the analysis of ratio and regression estimators. The following theorem, due to Brewer (1963) and Royall (1970b), is the most important result under Model $G_R$. The theorem is true for any design p and gives the best linear $\xi$-unbiased ($\xi$-BLU) predictor of $\bar{Y}$. Here "best" is in the sense of minimum $\mathcal{E}MSE$. We shall denote this $\xi$-BLU predictor by $T_{BR}$. It also follows from this theorem that $T_{BR}$ does not depend explicitly on the design, but its $\mathcal{E}MSE$ does.


Theorem 2.6. Under Model $G_R$, and for known auxiliary variable measurements $x_k > 0$, $k = 1, \ldots, N$, the $\xi$-BLU predictor of $\bar{Y}$ is, for any design p, given by

(2.36) $T_{BR} = f_s\bar{Y}_s + (1-f_s)\hat{\beta}\bar{x}_{\bar{s}}$,

where

(2.37) $\hat{\beta} = \dfrac{\sum_s x_k Y_k/u(x_k)}{\sum_s x_k^2/u(x_k)}$.

Furthermore,

(2.38) $\mathcal{E}MSE(p,T_{BR}) = \dfrac{E\{(\sum_{\bar{s}} x_k)^2 V(\hat{\beta}) + \sigma^2\sum_{\bar{s}} u(x_k)\}}{N^2}$,

where $V(\hat{\beta}) = \sigma^2/\sum_s (x_k^2/u(x_k))$. □
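To make the form of the Brewer-Royall predictor concrete, the following Python sketch computes $T_{BR} = f_s\bar{Y}_s + (1-f_s)\hat{\beta}\bar{x}_{\bar{s}}$ with $\hat{\beta}$ the weighted least squares slope through the origin, weights $1/u(x_k)$. It is a minimal sketch under the assumption that Model $G_R$ has $\mathcal{E}(Y_k) = \beta x_k$ and $V(Y_k) = \sigma^2 u(x_k)$; the function name and the data are hypothetical.

```python
def brewer_royall(y_s, x_s, x_rest, u):
    """Predict the population mean from sampled (x, y) pairs and non-sampled x values."""
    n = len(y_s)
    N = n + len(x_rest)
    f = n / N
    # Weighted least squares slope through the origin with weights 1 / u(x).
    beta_hat = (sum(x * y / u(x) for x, y in zip(x_s, y_s))
                / sum(x * x / u(x) for x in x_s))
    y_bar_s = sum(y_s) / n
    x_bar_rest = sum(x_rest) / (N - n)
    return f * y_bar_s + (1 - f) * beta_hat * x_bar_rest


# Hypothetical data: three sampled units, two non-sampled units.
x_s, y_s = [1.0, 2.0, 3.0], [2.0, 3.0, 7.0]
x_rest = [4.0, 5.0]

t_g1 = brewer_royall(y_s, x_s, x_rest, u=lambda x: x)       # u(x) = x
t_g2 = brewer_royall(y_s, x_s, x_rest, u=lambda x: x * x)   # u(x) = x^2

# For comparison: the classical ratio predictor x_bar * y_bar_s / x_bar_s.
x_bar = (sum(x_s) + sum(x_rest)) / 5
ratio_predictor = x_bar * (sum(y_s) / 3) / (sum(x_s) / 3)
```

With $u(x) = x$ the function returns the same value as the classical ratio predictor, while $u(x) = x^2$ replaces the ratio of means by the mean of the ratios $Y_k/x_k$.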
Special Cases: Let us denote $T_{BR}$ by $T_{BRg}$ if $u(x) = x^g$. If $u(x) = x^g$, as assumed in much of the earlier literature on survey sampling, we have the following special cases.

(i) If $g = 1$, i.e., $u(x_k) = x_k$, $k = 1, \ldots, N$, then $T_{BR1} = T_R$, the classical ratio predictor


(2.39) $T_R = \bar{x}\,\bar{Y}_s/\bar{x}_s$

with

(2.40) $\mathcal{E}MSE(p,T_R) = \dfrac{\bar{x}\,E\{N\bar{x}/(\nu(s)\bar{x}_s) - 1\}\sigma^2}{N}$.

The predictor $T_R$ is $\xi$-unbiased.


(ii) If $g = 2$, i.e., $u(x_k) = x_k^2$, $k = 1, \ldots, N$, then

(2.41) $T_{BR2} = \bar{x}\,\bar{r}_s + f_s(\bar{Y}_s - \bar{x}_s\bar{r}_s)$,

where

(2.42) $\bar{r}_s = \dfrac{1}{\nu(s)}\sum_s Y_k/x_k$.

If p is an FES(n) design, i.e., $\nu(s) = n$, and assuming $\alpha_k = n x_k/(N\bar{x})$ in the Horvitz-Thompson predictor $T_{HT}$ of (2.11), then we have $T_{HT} = \bar{x}\,\bar{r}_s$. Royall (1970b) has shown that if (i) p is any FES(n) design, (ii) $n x_k/(N\bar{x}) \le 1$ for all $k = 1, \ldots, N$, and (iii) $u(x)/x^2$ is a non-increasing function (usually $0 \le g \le 2$), then under Model $G_R$,

$\mathcal{E}MSE(p, \bar{x}\,\bar{r}_s) \ge \mathcal{E}MSE(p, T_{BR2})$,

or, in the present notation for $T_{HT}$,

$\mathcal{E}MSE(p, T_{HT}) \ge \mathcal{E}MSE(p, T_{BR2})$.


It is clear from the above that the $\mathcal{E}MSE$ of the strategy $(p, T_{BR})$ depends on p through $E(.)$ in (2.38). A pre-sampling judgment may be required as to how p should be chosen such that $\mathcal{E}MSE(p, T_{BR})$ is minimized. Under model $\xi$, after the sample has already been selected, the inference problem is simply the classic one of predicting the unobserved random variable $\bar{Y}_{\bar{s}}$, and the sample s should be one which permits a good predictor. This idea of Royall (1970b) has been criticized by some authors for adopting purposive samples.


Expression (2.38) can be rewritten as follows:

(2.43) $\mathcal{E}MSE(p,T_{BR}) = E\{\mathcal{E}(T_{BR}-\bar{Y})^2\} = \dfrac{1}{N^2}E\left[(\sum_{\bar{s}} x_k)^2\,\mathcal{E}(\hat{\beta}-\beta)^2 + \sigma^2\sum_{\bar{s}} u(x_k)\right]$.

Now if our objective is to find a design p for which this is minimum, then we have two options:

(i) to select a sample which will give a good estimate of the expected value of the mean of the non-sampled units, i.e., to choose s so that

$\dfrac{1}{N^2}(\sum_{\bar{s}} x_k)^2\,\mathcal{E}(\hat{\beta}-\beta)^2$

is small, or

(ii) to observe those y-values which have the greatest variances, so that only the sum of the least variable values has to be predicted, i.e., to choose s such that $\sum_{\bar{s}} u(x_k)$ is small.
So it turns out that for a wide class of variance functions, the optimum strategy is to use $T_{BR}$ with a purposive sample s of FES(n) which contains the n largest x-values of the population.

Formally, let $\mathcal{S}_n = \{s : \nu(s) = n\}$ and let $s^*$ be the set of labels such that

(2.44) $\max_{s \in \mathcal{S}_n}\sum_s x_k = \sum_{s^*} x_k$,

and let the design $p^* = p^*(s)$ be such that

(2.45) $p^*(s) = 1$ if $s = s^*$, and $p^*(s) = 0$ if $s \ne s^*$.
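The optimality of the extreme sample can be illustrated by brute force on a small population. The Python sketch below is illustrative only and assumes Model $G_R$ with $u(x) = x$, so that for a fixed sample $\mathcal{E}(T_R-\bar{Y})^2 = \sigma^2\{(\sum_{\bar{s}} x_k)^2/\sum_s x_k + \sum_{\bar{s}} x_k\}/N^2$. It evaluates this quantity for every sample of size n and compares the minimizer with the sample of the n largest x-values. The x-values are hypothetical.

```python
from itertools import combinations


def cond_emse_ratio(x, sample, sigma2=1.0):
    """Model mean square error of the ratio predictor for a fixed sample, assuming u(x) = x."""
    N = len(x)
    sum_s = sum(x[k] for k in sample)
    sum_rest = sum(x) - sum_s
    return sigma2 * (sum_rest ** 2 / sum_s + sum_rest) / N ** 2


# Hypothetical auxiliary values (labels are 0-based).
x = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
n = 3

# Exhaustive search over all samples of size n.
best = min(combinations(range(len(x)), n), key=lambda s: cond_emse_ratio(x, s))

# The extreme sample s*: labels of the n largest x-values.
s_star = tuple(sorted(sorted(range(len(x)), key=lambda k: x[k])[-n:]))
```

The exhaustive search and the rule "take the n largest x-values" select the same sample here, which is what the design $p^*$ formalizes.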


Then the theorem of Royall (1970b) follows:

Theorem 2.7. Let p be any FES(n) design, and let $p^*$ be defined by (2.45). If $u(x)$ is non-decreasing and $u(x)/x^2$ is non-increasing, then, under Model $G_R$,

$\mathcal{E}MSE(p,T) \ge \mathcal{E}MSE(p^*,T_{BR})$,

where T is any linear $\xi$-unbiased predictor of $\bar{Y}$ and $T_{BR}$ is given by (2.36). □


Use of this type of extreme design is open to much criticism. J.N.K. Rao (1975) points out that there are, no doubt, situations in which the extreme design $p^*$ can be highly efficient for prediction of one y-mean. But in most surveys we also estimate mean values of other characters. An extreme sample is not likely to work well if several means have to be estimated in the same survey. So, it is preferable and safer to use a simple random sample in the case of multipurpose studies.

It is also obvious, from the above, that the result depends too much on the assumed model. If Model $G_R$ is not true, that is, if $\mathcal{E}(Y_k) = \beta x_k^m$, $m \ne 1$ ($m = 1$ in the case of $G_R$), then the $\xi$-bias of $T_R$ is

(2.46) $\mathcal{E}(T_R - \bar{Y}) = \beta\bar{x}\left\{\dfrac{\sum_s x_k^m}{\sum_s x_k} - \dfrac{\sum_1^N x_k^m}{\sum_1^N x_k}\right\}$.

A simple random sample is likely to give a small bias in such cases, but the extreme design $p^*$ can be expected to produce a higher $\xi$-bias.
Results based on Model $G_R$ can be easily extended to Model $G_{MR}$. Various results under Model $G_{MR}$ have been given by Hartley and Sielken (1975), Royall (1976), Royall and Cumberland (1978a) and Tallis (1978). To present some of these results, let us introduce the following notation.


(2.47) $\mathcal{E}(Y_s) = X_s\beta$, $\mathcal{E}(Y_{\bar{s}}) = X_{\bar{s}}\beta$,

$V(Y_s) = \sigma^2 V_s$, $V(Y_{\bar{s}}) = \sigma^2 V_{\bar{s}}$, $C(Y_s, Y_{\bar{s}}) = 0$,

where $Y_s$ is a $\nu(s)$-vector, i.e., the vector of sampled $Y_k$ values, $k \in s$, and $Y_{\bar{s}}$ is an $\{N-\nu(s)\}$-vector having the non-sampled Y-values as its components. Let us further assume that in both cases the $Y_k$'s are enumerated in order of increasing k. $\beta' = (\beta_1, \ldots, \beta_q)$ is a vector of unknown parameters. The known matrices $X_s$ and $X_{\bar{s}}$ are of order $\nu(s) \times q$ and $\{N-\nu(s)\} \times q$ respectively. Let the row vector corresponding to unit k of $X_s$ or $X_{\bar{s}}$ be denoted by $x_k' = (x_{k1}, \ldots, x_{kq})$, where $x_{k1} = 1$ for all values of k. So we have $(q-1)$ auxiliary variables, $x_{k2}, \ldots, x_{kq}$, measured on each unit k of the population. The diagonal matrices $V_s$ and $V_{\bar{s}}$ are of order $\nu(s) \times \nu(s)$ and $\{N-\nu(s)\} \times \{N-\nu(s)\}$ respectively. The diagonal element for unit k is $u_k$, a known quantity. Hence

$V(Y_k) = \sigma^2 u_k$,

where $\sigma^2$ is unknown.
Under Model $G_{MR}$, the following theorem due to Cassel et al. (1977) gives the $\xi$-BLU predictor.

Theorem 2.8. Under Model $G_{MR}$, and for known auxiliary measurements $X_s$ and $X_{\bar{s}}$, the $\xi$-BLU predictor of $\bar{Y}$, for any design p, is

(2.48) $T_{BLU} = f_s\bar{Y}_s + (1-f_s)m_{\bar{s}}'\hat{\beta}$,

where $m_{\bar{s}}' = (m_{\bar{s}1}, \ldots, m_{\bar{s}q})$ and, for $j = 1, \ldots, q$,

$m_{\bar{s}j} = \sum_{\bar{s}} x_{kj}/(N-\nu(s))$.

Moreover,

(2.49) $\hat{\beta} = (X_s'V_s^{-1}X_s)^{-1}X_s'V_s^{-1}Y_s$.

$\mathcal{E}MSE(p,T_{BLU})$ is equal to the p-expectation of

(2.50) $\mathcal{E}(T_{BLU}-\bar{Y})^2 = \left[\dfrac{\sum_{\bar{s}} u_k}{N^2} + (1-f_s)^2\,m_{\bar{s}}'(X_s'V_s^{-1}X_s)^{-1}m_{\bar{s}}\right]\sigma^2$. □
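The $\xi$-BLU predictor under the multiple regression model can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes $\hat{\beta}$ to be the weighted least squares solution of $(X_s'V_s^{-1}X_s)\beta = X_s'V_s^{-1}Y_s$ with $V_s = \mathrm{diag}(u_k)$, solves the normal equations with a small hand-written elimination routine (`solve` is a helper introduced here), and forms $f_s\bar{Y}_s + (1-f_s)m_{\bar{s}}'\hat{\beta}$. All data are hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    q = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(q):
        piv = max(range(col, q), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(q):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][q] / M[i][i] for i in range(q)]


def blu_predictor(X_s, y_s, u_s, X_rest):
    """Weighted least squares prediction of the population mean."""
    n, q = len(y_s), len(X_s[0])
    N = n + len(X_rest)
    # Normal equations with weights 1 / u_k.
    XtVX = [[sum(X_s[k][i] * X_s[k][j] / u_s[k] for k in range(n))
             for j in range(q)] for i in range(q)]
    XtVy = [sum(X_s[k][i] * y_s[k] / u_s[k] for k in range(n)) for i in range(q)]
    beta_hat = solve(XtVX, XtVy)
    # Means of the auxiliary variables over the non-sampled units.
    m_rest = [sum(row[j] for row in X_rest) / (N - n) for j in range(q)]
    f = n / N
    return f * sum(y_s) / n + (1 - f) * sum(m * b for m, b in zip(m_rest, beta_hat))


# Hypothetical population in which y = 2 + 3x holds exactly.
X_s = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0]]      # first column is the constant 1
y_s = [5.0, 8.0, 14.0]
u_s = [1.0, 2.0, 4.0]
X_rest = [[1.0, 3.0], [1.0, 5.0], [1.0, 6.0]]

T = blu_predictor(X_s, y_s, u_s, X_rest)
Y_bar = (5.0 + 8.0 + 14.0 + 11.0 + 17.0 + 20.0) / 6
```

When the regression holds without error, as in this toy population, the predictor reproduces the population mean exactly, whatever sample is used.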


It is clear that under Models $G_R$ and $G_{MR}$, the $\xi$-BLU predictors are based on weighted least squares estimators. They do not depend on any particular design. On this point Scott and Smith (1974) say,

"The fact that the estimators do not depend on the design p(.) may worry some people, but it seems to us that when prior knowledge is so strong that it can be specified by model of the form (1) (simple linear regression model) then the relationships expressed in the model should override the sampling scheme for certain purposes."


Obviously, model based inference depends very much on the model assumed. So the natural question is: what will be the behavior of the optimal predictors if the assumed model is not true, or deviates slightly from the true one? This leads us to study the robustness of predictors.

§2.6 ROBUSTNESS IN MODEL BASED INFERENCE


In real life, it is not known which model is producing our actual population. So, whenever we have doubt about the assumed super-population model, the correctness of the results established in the preceding sections becomes questionable. Royall and Herson (1973a, b) first discussed this problem with a polynomial regression model:

(2.51) $Y_k = h(x_k) + e_k$, $k = 1, \ldots, N$,

where

(2.52) $h(x_k) = \sum_{j=0}^{J}\delta_j\beta_j x_k^j$

and $\delta_j = 0$ or 1 depending on whether the term $x^j$ is present in the model or not. Also, the $e_k$'s are uncorrelated random errors with $\mathcal{E}(e_k) = 0$ and $V(e_k) = \sigma^2 u(x_k)$, $k = 1, \ldots, N$. They denoted this model as $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$. In our present notation this is a special case of Model $G_{MR}$. In particular, if $\delta_0 = 0$, $\delta_1 = 1$ and $\delta_2 = \cdots = \delta_J = 0$, then the above model reduces to $\xi(0,1 : u(x))$, which is Model $G_R$. Let us consider the following two cases:


Case 1. Misspecification of variance function under Model $G_R$.

The Brewer-Royall predictor $T_{BR}$, as defined in (2.36) and (2.37), is the $\xi$-BLU predictor of $\bar{Y}$ under Model $G_R$. The form of $T_{BR}$ obviously depends on the specification of $u(x)$. In previous sections we also introduced Model $G_{Rg}$, which is Model $G_R$ with $u(x) = x^g$, and the corresponding $\xi$-BLU predictor is $T_{BRg}$. If the assumed model is $G_{Rg_0}$, then $T_{BRg_0}$ is supposed to be optimal. But if it so happens that the true model is $G_{Rg_1}$, $g_1 \ne g_0$, then in general $T_{BRg_0}$ is no longer most efficient, although it is still $\xi$-unbiased. In this case the preferred predictor is $T_{BRg_1}$. For fixed values of $g_0$ and $g_1$, neither of which is necessarily the true value of g, the following theorem due to Royall (1970b) gives us an indication of which predictor to prefer.


Theorem 2.9. If $0 \le g_0 < g_1$, then, for any FES(n) design p, and for any specification of the function $u(x)$ in Model $G_R$ such that $u(x)/x^{g_0}$ is non-increasing,

(2.53) $\mathcal{E}MSE(p,T_{BRg_0}) \le \mathcal{E}MSE(p,T_{BRg_1})$.

For any function $u(x)$ such that $u(x)/x^{g_1}$ is non-decreasing, the inequality in (2.53) is reversed. For strict inequality in (2.53), it is sufficient that $p(s) > 0$ for some s such that $\nu(s) = n$ and $x_k \ne x_l$ for some $k \ne l$ in s. □
Case 2. Misspecification in polynomial regression model. 


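Theorem 2.10 (Royall and Herson, 1973a). Under the model $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ and for known auxiliary variable measurements $x_k > 0$, $k = 1, \ldots, N$, the $\xi$-BLU predictor of $\bar{Y}$ is, for any design p, given by

(2.54) $T(\delta_0, \ldots, \delta_J : u(x)) = f_s\bar{Y}_s + (1-f_s)\sum_{j=0}^{J}\delta_j\hat{\beta}_j m_{\bar{s}j}$,

where, for $j = 0, \ldots, J$,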




(2.55) $m_{\bar{s}j} = \sum_{\bar{s}} x_k^j/(N-\nu(s))$

and the $\hat{\beta}_j$ are the (weighted) least squares estimates of the $\beta_j$'s under the model $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$. □

This theorem gives the $\xi$-BLU predictor of $\bar{Y}$ in the situation where the $\beta_j$'s are estimable.


Royall and Herson (1973a) analyzed robustness using the model $\xi(0,1 : x)$ and the corresponding predictor $T_R = \bar{x}\,\bar{Y}_s/\bar{x}_s$. By Theorem 2.6, this predictor is $\xi$-BLU and is the classical ratio estimator. If, however, the alternative model $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ is true, then the preferred predictor, for any design, is given by Theorem 2.10. Moreover, $T_R$ is in general $\xi$-biased under a model $\xi(\delta_0, \ldots, \delta_J : u(x)) \ne \xi(0,1 : x)$. The bias of $T_R$ is

(2.56) $\mathcal{E}(T_R - \bar{Y}) = \sum_{j=0}^{J}\delta_j\beta_j m_1\{(m_{sj}/m_{s1}) - (m_j/m_1)\}$,

where

$m_{sj} = \dfrac{1}{\nu(s)}\sum_s x_k^j$, $m_j = \dfrac{1}{N}\sum_1^N x_k^j$.

It is clear from (2.56) that the $\xi$-bias is zero if

(2.57) $m_{sj}/m_{s1} = m_j/m_1$

for all j such that $\delta_j = 1$ in the model $\xi(\delta_0, \ldots, \delta_J : u(x))$. The idea of a balanced sample in survey sampling comes from this relation. Royall and Herson (1973a) defined a balanced sample as follows:


Balanced Sample: A balanced sample denoted by s(J) is a sample 




satisfying (2.57) for all $j = 1, \ldots, J$; that is, $s(J)$ is such that

(2.58) $m_{sj} = m_j$, $j = 1, \ldots, J$.

A sample s such that (2.58) holds for $j = j_0 \le J$ is said to be balanced on the $j_0$th moment. A design p which selects, with probability one, a balanced sample will be called a balanced (sample) design and will be denoted by $p_b(J)$. □


It is difficult to get a sample which is balanced up to the Jth order. A simple random sample usually gives an approximately balanced sample. We shall discuss methods of approximating balanced samples and their alternatives in Chapter 3.
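The role of balance can be seen in a small computation. The Python sketch below is illustrative only: it evaluates the $\xi$-bias of the ratio predictor, $\sum_j \delta_j\beta_j m_1\{(m_{sj}/m_{s1}) - (m_j/m_1)\}$, under a polynomial model with hypothetical coefficients, once for a sample whose mean of x equals the population mean and once for an extreme sample of the largest x-values. The function names and all numbers are assumptions made for this example.

```python
def moments(values, J):
    """Return [m_0, m_1, ..., m_J], the first J raw moments of the given x-values."""
    return [sum(v ** j for v in values) / len(values) for j in range(J + 1)]


def ratio_bias(x, sample, delta, beta):
    """Model bias of the ratio predictor under the polynomial model with indicators delta."""
    J = len(delta) - 1
    m = moments(x, J)
    m_s = moments([x[k] for k in sample], J)
    return sum(d * b * m[1] * (m_s[j] / m_s[1] - m[j] / m[1])
               for j, (d, b) in enumerate(zip(delta, beta)))


# Hypothetical population; true model has an intercept: E(Y) = 2 + 3x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
delta, beta = [1, 1], [2.0, 3.0]

balanced = [0, 5]   # x-values 1 and 6: sample mean 3.5 equals the population mean
extreme = [4, 5]    # the two largest x-values

bias_balanced = ratio_bias(x, balanced, delta, beta)
bias_extreme = ratio_bias(x, extreme, delta, beta)
```

The balanced sample removes the bias caused by the omitted intercept, while the extreme sample does not. Note that the sample {1, 6} is balanced on the first moment only; its second moment differs from that of the population.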

Using a balanced sample we can eliminate the bias incurred by the ratio estimator $T_R$ if the actual model is $\xi(1,1 : u(x))$. But to obtain this property we have to pay a price in efficiency. Assuming an FES(n) design, Royall and Herson (1973a) compare the balanced sampling strategy $R_{BAL} = (p_b(1), T_R)$ to $R_{OPT} = (p^*, T_R)$, which has minimum $\mathcal{E}MSE$ under $\xi(0,1 : x)$, where $p^*$ is the optimum design as given in (2.45). We have

$\mathcal{E}MSE(R_{OPT}) = \min_{s \in \mathcal{S}_n}(\bar{x}_{\bar{s}}/\bar{x}_s)(1-f)\bar{x}\sigma^2/n$

and

$\mathcal{E}MSE(R_{BAL}) = (1-f)\bar{x}\sigma^2/n$,

where $f = \nu(s)/N = n/N$. Therefore, the efficiency loss is the absolute value of

(2.59) $\min_{s \in \mathcal{S}_n}(\bar{x}_{\bar{s}}/\bar{x}_s) - 1 \le 0$.
gehen iat 


Royall and Herson (1973a) have given some numerical results on efficiency for different types of populations. The general conclusion of their study is that the shape of the distribution is a less important factor in determining the efficiency than is y, the ratio of the extremes of the distribution, for distributions with finite lower and upper limits of range. Another result is that protection against $\xi$-bias is often costly from an efficiency point of view.

The balanced design $p_b(1)$ protects the predictor $T_R$ against the $\xi$-bias which would be incurred if $\xi(1,1 : u(x))$, not $\xi(0,1 : x)$, is the true model. The most attractive property of the balanced sample design is that if the conditions $m_{sj} = m_j$, $j = 1, \ldots, J$, are satisfied, then $T_R$ is protected against $\xi$-bias under any model $\xi(\delta_0, \ldots, \delta_J : u(x))$. There is no additional loss of efficiency of $(p_b(J), T_R)$ relative to $R_{OPT}$. It is also observed that $T_R$ reduces to $\bar{Y}_s$ under balanced sampling. Royall and Herson (1973a) have also shown that if $T = T(\delta_0, \ldots, \delta_J : u(x))$ denotes the $\xi$-BLU predictor given by Theorem 2.10, then under balanced design $p_b(J)$,

The idea of balanced sampling is also extended to the classical regression predictor,

$T_{REG} = \bar{Y}_s + b(\bar{x} - \bar{x}_s)$

with

$b = \dfrac{\sum_s (x_k - \bar{x}_s)Y_k}{\sum_s (x_k - \bar{x}_s)^2}$,

which is $\xi$-BLU for the model $\xi(1,1 : 1)$. Now, if the alternative model $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ were actually true, then in general $T_{REG}$ is $\xi$-biased. This bias can be removed if we choose the balanced design $p_b(J)$. For any balanced sample, the predictor $T_{REG}$ also reduces to the sample mean $\bar{Y}_s$.

Balancing the design is, on the average, equivalent to p-unbiasedness, and the protection sought by balancing eliminates the efficiency gain that would be realized under the model based approach if we were willing to accept an extreme, purposive sample as a basis for inference.

Recently, attempts have been made by many authors to compare and, if possible, to combine the design based approach and the model based approach in survey sampling. Royall (1976a) and Scott and Smith (1969) have applied super-population models to two-stage sampling. Scott and Smith derived results by using Bayesian techniques and established optimality among linear unbiased estimators. Royall (1976b) has studied the linear least squares prediction approach in two-stage sampling and then used a probability model to analyze various conventional estimators and certain estimators suggested by theory as alternatives to the conventional estimators. Särndal (1978) has compared the two approaches for estimating a population mean. He showed that several of the conventional results can be obtained and reinterpreted through model based theory, and found that the model based framework often offers advantages over the design based one when it comes to presenting a lucid argument in favour of some given sampling procedure. Thomsen (1978) has given some examples where super-population ideas in survey sampling were applied to different surveys in Norway. Empirical studies of prediction theory have been done most recently by Royall and Cumberland

CHAPTER III


RANDOMIZATION AND BALANCED SAMPLING 


§3.1 RANDOMIZATION 


Randomization is a well-known and widely used method of survey data collection and analysis. The main purpose of this method is to make objective inferences and to present the results of a survey in a way that is convincing to users. Keeping these and other advantages in mind, survey statisticians have developed in past years, under varied population structures, different survey designs and hence various estimators for the parameters of interest. These design based inferences are still overwhelmingly in use. Since 1960, design based inference has faced new challenges. This has been briefly discussed in Chapter 1.

The method of maximum likelihood is still one of the most important ways of estimation in statistics. However, for a long time the likelihood method was essentially a failure in survey sampling, especially under the design based approach. For any design p(.) and for any population vector $y = (y_1, \ldots, y_N)$ treated as a parameter, the probability that the random quantity D will take a value $d = \{(k, y_k) : k \in s\}$ is given by

(3.1) $P_y(d) = p(s)$ if d is consistent with y, i.e., if $y \in \Omega_d$, and $P_y(d) = 0$ otherwise,

where a specified value $d = \{(k, y_k) : k \in s\}$ is said to be consistent with a population vector $y_0 = (y_{01}, \ldots, y_{0N})$ if and only if $y_k = y_{0k}$ for all $k \in s = \{k_1, \ldots, k_n\}$, the sample. $\Omega_d$ is the set of all $y \in R_N$ such that d is consistent with y.

It follows from (3.1) that $Pr(D = d / S = s) = 1$ if d is consistent with y, and zero otherwise. If our interest is in the parameter y and if the design is uninformative, then from (3.1) we also find that the likelihood function $L(y/d) = P_y(d)$ takes the same value for every y consistent with d. That is, the likelihood is flat, so every consistent value of y is equally likely and no unique maximum likelihood estimator is available. A likelihood function of the form (3.1), which is not informative in nature, was first studied by Godambe (1966). But with a super-population model at the back of the finite population, the appropriate likelihood function may be more informative (Royall, 1976a). In view of (3.1), the likelihood principle, when applied to survey sampling under the fixed population approach, has the following two consequences.

(i) Inference from survey data should be independent of the 

sample design. 

(ii) The only inference about y sanctioned by the likelihood principle is the trivial one that the components $y_k$ for $k \in s$ must coincide with the observed values. It does not admit discrimination among the possible values of the unobserved components of y, since all the values of $y \in \Omega_d$ have the same likelihood.
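The flatness of the likelihood (3.1) is easy to see in a toy example. The Python sketch below is only an illustration: the function `likelihood`, the observed data and the candidate population vectors are hypothetical, and the design is taken to be simple random sampling of 2 units out of 4, so that $p(s) = 1/6$.

```python
def likelihood(y, d, p_s):
    """P_y(d) as in (3.1): p(s) when y agrees with every observed (k, y_k), else 0."""
    consistent = all(y[k] == y_k for k, y_k in d.items())
    return p_s if consistent else 0.0


# Observed data: unit 0 has value 3.0 and unit 2 has value 5.0 (labels are 0-based).
d = {0: 3.0, 2: 5.0}
p_s = 1 / 6   # probability of this sample under simple random sampling of 2 from 4

y_a = [3.0, 1.0, 5.0, 9.0]       # consistent with d
y_b = [3.0, 100.0, 5.0, -2.0]    # also consistent, very different unobserved values
y_c = [4.0, 1.0, 5.0, 9.0]       # contradicts the observed value for unit 0

L_a, L_b, L_c = (likelihood(y, d, p_s) for y in (y_a, y_b, y_c))
```

The two consistent vectors receive exactly the same likelihood however different their unobserved components are, which is the sense in which the likelihood carries no information about the non-sampled units.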


However, from a somewhat different point of view another likelihood function emerges which can yield a maximum likelihood estimate of y under certain conditions; see Royall (1968) and Hartley and Rao (1968, 1969).

Very interesting and detailed discussions of the likelihood function, sufficiency, randomization, etc. for finite population sampling are given in a series of papers by Basu (1969, 1971, 1978). Basu wrote in the summary of his 1969 paper: "[We] examine the role of sufficiency and likelihood principle in the analysis of survey data and arrived at the revolutionary but reasonable conclusion that, once the sample has been drawn, the inference should not depend in any way on the sampling design. This poses the problem of designing a survey which will yield a good (representative) sample. The randomization principle is examined from this view point and it is noticed that there is very little, if any, use for it in survey design." In the design based approach to sampling theory, as we mentioned earlier, there is only one source of randomization in the data. The artificial randomization created by the sampler himself is not inherent to the problem. All results of conventional theory are based on this randomization. Basu (1969) suggests, from the Bayesian point of view: once the data d is in our hand, forget about the sampling plan $(\mathcal{S}, p(.))$, which is an artificial source of randomization. In the Bayesian plan for selecting the data d, there is no place for symmetric dice or random number tables. But, unfortunately, until recently sufficient attention has not been given to the problem. Basu suggests that any reasonable Bayesian sampling strategy would have the following characteristics:

(a) The sampling plan would usually be sequential. The statistician should continue sampling (one or a few units at a time) until he is satisfied with the information thus obtained or until he reaches the end of his resources (time and cost). His decision to select the units for a particular sampling stage would depend (non-randomly) on the sample obtained in the previous stages.

(b) The probability that the statistician would end up observing the units $s = (k_1, k_2, \ldots, k_n)$, in this order, would depend on s and the state of nature y. This probability would be degenerate, i.e., zero for some values of y and unity for the rest of the values of y.


We have already mentioned that, viewing the likelihood
function from a different angle, some authors arrived at different
types of likelihood which readily yield a unique maximum. But it is
to be noted here that in all those likelihood functions they ignored
the label part k of the data $d = \{(k, y_k) : k \in s\}$ and considered
only the unlabeled observations $\{y_k : k \in s\}$. Basu (1971) pointed out that
the label part k is an ancillary statistic, that is, the sampling
distribution of the statistic k does not involve the state of
nature $y = (y_1, \ldots, y_N)$. The sampling distribution of $k = (k_1, \ldots, k_n)$
is uniquely determined by the sample design. It is therefore obvious that 
the label part of the data cannot, by itself, provide any information 
about y. Knowing k, we only know the names (labels) of the population 
units that are selected for observation. Usually, we incorporate the 
prior knowledge of the auxiliary variable $x = (x_1, \ldots, x_N)$ in the
sampling plan. But this does not alter the above situation. The
label k of the data d will still be an ancillary statistic. Now
the question is: if the label part k carries no information, does
the observation part of the data, namely the unlabeled y-values, contain
all the available information about y? Basu (1971) answers this question
with a definite 'no' and says that a great deal of the information will
be lost if the label part of the data is suppressed. Without the
knowledge of k, the surveyor cannot relate the components of the
observation vector y to the population units and so he cannot make
any use of the auxiliary character $x = (x_1, \ldots, x_N)$ and whatever
other prior knowledge he may have about the relationship between y
and x.

Basu (1978) has given a counterexample (Example 4.1) where
the optimum sampling plan would be sequential and non-randomized. In
that example it is very difficult to find any justification for random
sampling. Randomization is, however, so deeply rooted in statistics that
it is difficult to dismiss on the strength of a few counterexamples. In
the view of the above-mentioned statisticians, the main use of
randomization is to safeguard the sample against unknown biases. Unlike
in the conventional approach, the survey design can no longer be the only
determinant for judging the quality of
the data. Basu (1978) suggests that the principal determinant of how
a particular datum ought to be analyzed is the datum itself. The key
concept in survey theory ought to be the notion of poststratification.
"Randomization is widely recognized as a basic principle of statistical
experimentation. Yet we find no satisfactory answer to the question,
Why randomize?", Basu (1980).


§3.2 BALANCED SAMPLING


At present there is a general feeling among statisticians that
artificial randomization in survey sampling should not be the only
basis of inference. Purposive sampling is nowadays increasingly
finding justification in the analysis of survey data.
Purposive samples are subjective, but some rational and
objective justifications are available for them. The discussion of
the previous section indicates that we need an alternative to the
randomization principle. The prediction approach or the Bayesian approach
in survey sampling may serve as useful alternatives. These approaches
usually lead us to the selection of purposive samples. We have already
mentioned in Chapter 2 that under certain super-population models, the
optimum sampling strategy (2.45) or balanced samples are more desirable
than random samples.

We have defined the balanced sample in Chapter 2 and have
discussed the robustness of the estimators under balanced samples. In
this section we shall discuss how to get balanced samples and their
extensions.


Approximate Balanced Samples 


Selection of exact balanced samples of higher order is a big
practical problem. Usually, not all values of the auxiliary variable x are
known to the sampler. In such a case it is impossible to get a balanced
sample. Even if all values of x are known, exact satisfaction
of $\bar{x}_s^{(j)} = \bar{x}^{(j)}$, j = 1,...,J, is usually impossible. Here, as in
Chapter 2, $\bar{x}_s^{(j)} = \sum_s x_i^j / n$ and $\bar{x}^{(j)} = \sum_{i=1}^{N} x_i^j / N$. However, when
J and the sampling fraction f are small, it is easier to get an
approximately balanced sample s(J). A random
selection of units is expected to give an approximately balanced
sample. The average value of $\bar{x}_s^{(j)}$ over all $\binom{N}{n}$ samples s is
$\bar{x}^{(j)}$ for j = 1,2,... . So we can expect that a random sample s is
approximately s(J). Simple random sampling is supposed to yield a fair
approximation to s(J) for J > 1, so when we use this approximately
balanced sample the ratio and regression estimators are approximately
unbiased under polynomial regression models. The estimators will
also be approximately optimal under models where the variance of Y
is a polynomial in x of degree J or less. It is true that there is a
possibility of large deviations of the sample from the balanced sample,
depending on the dispersion of the x values. If this circumstance arises,
then it is advisable to use restricted randomization, censoring or
post-stratification of the data. Royall and Herson (1973a) have given an
expression for the extent of bias in ratio estimators using approximately
balanced samples.
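
The statement that the sample moments average to the population moments over all possible samples can be checked by enumeration on a toy population. The following Python sketch is only an illustration and is not taken from Royall and Herson (1973a); the x-values and the helper names `moment`, `average_sample_moment` and `imbalance` are assumptions made here, and the exhaustive search for the best-balanced sample is feasible only for very small N.

```python
from fractions import Fraction
from itertools import combinations

def moment(values, j):
    """Mean of the j-th powers of the values, as an exact fraction."""
    return Fraction(sum(v ** j for v in values), len(values))

def average_sample_moment(x, n, j):
    """Average of the sample j-th moment over all C(N, n) samples of size n."""
    samples = list(combinations(x, n))
    return sum(moment(s, j) for s in samples) / len(samples)

def imbalance(sample, x, J):
    """Largest absolute gap between sample and population moments, j = 1..J."""
    return max(abs(moment(sample, j) - moment(x, j)) for j in range(1, J + 1))

# Hypothetical auxiliary values, chosen only for illustration.
x = [1, 2, 3, 4, 6, 8, 9, 12]
n, J = 4, 2

# Exhaustive search for the sample closest to s(J); an assumption of this
# sketch, not a selection procedure proposed in the text.
best = min(combinations(x, n), key=lambda s: imbalance(s, x, J))
```

In practice one would compare `imbalance` for a randomly drawn sample against a tolerance and redraw (restricted randomization) when the gap is too large.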

In surveys usually more than one auxiliary variable is
available. It is very difficult to get a sample that is balanced with
respect to all those characters. However, random selection will at
least justify some degree of confidence that the selected sample is
approximately representative. It should be mentioned here that neither
purposive selection of a balanced sample nor restricted randomization
nor unrestricted random selection will guarantee balance on other
variables not explicitly considered in choosing the sample.


Extensions and Other Types of Balanced Samples 


Up to this point we have only discussed the balanced sample
suggested by Royall and Herson (1973a). There are other forms of
balanced samples whose definitions depend on the super-population model
under consideration. Some of these are direct extensions of the already
defined balanced sample. Holt (1975) has extended the idea of the balanced
sample to a linear multiple regression model and defined the balanced
sample as the sample for which the first moments of each of the p auxiliary
variables for the sampled and non-sampled parts of the finite population
are equal. Using this balanced sample he obtained the BLU estimator for
the finite population total.

Scott, Brewer and Ho (1978) proposed an alternative to the balanced
sample which they called the "overbalanced sample". Their overbalanced
sample provides more efficient estimators than the balanced sample.
The principal results of this article are obtained using the model
$\xi(0,1 : V(x))$. Regardless of the manner in which the sample
observations have been obtained, the BLU predictor of the population
total Y under this model is

(3.2)   $T_0 = T[0,1 : V(x)] = \sum_s y_i + \dfrac{\sum_s y_i x_i / V(x_i)}{\sum_s x_i^2 / V(x_i)} \sum_{\bar{s}} x_i$ .

Let s*(J) be a particular sample for which

(3.3)   $\dfrac{\sum_s x_i^{j+1} / V(x_i)}{\sum_s x_i^2 / V(x_i)} = \dfrac{\sum_{\bar{s}} x_i^j}{\sum_{\bar{s}} x_i}$ ,   j = 0,1,...,J;

then the following Lemma and Theorem follow (Scott et al. (1978)).


Lemma 3.1. If s = s*(J), then $T_0$ is $\xi$-unbiased under the model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : V^*(x))$ for any $V^*(x)$.

In fact, $T_0$ is the BLU predictor when s = s*(J) for a
wide class of models.

Theorem 3.1. Suppose s = s*(J); then $T_0$ is the BLU predictor
under the model $\xi(\delta_0, \delta_1, \ldots, \delta_J : V^*(x))$ for any variance function of
the form

$V^*(x) = V(x) \sum_{j=0}^{J} \delta_j a_j x^{j-1}$ .


Special cases:

If V(x) = x then we get the Royall and Herson (1973a)
results leading to the balanced samples, so that $T_0$ reduces to the
ordinary ratio estimator,

(3.4)   $T_1 = \dfrac{\sum_s y_i}{\sum_s x_i} \sum_{i=1}^{N} x_i$ .

On the other hand, if $V(x) = x^2$, $T_0$ becomes

(3.5)   $T_2 = \sum_s y_i + \left( \dfrac{1}{n} \sum_s \dfrac{y_i}{x_i} \right) \sum_{\bar{s}} x_i$

and (3.3) becomes

(3.6)   $\dfrac{1}{n} \sum_s x_i^{j-1} = \dfrac{\sum_{\bar{s}} x_i^j}{\sum_{\bar{s}} x_i}$ .

Obviously, this is always true for j = 1. Scott et al. (1978)
called samples satisfying condition (3.6) "overbalanced". The
mean square error of $T_2$ under $\xi(0,1 : x^2)$ is

(3.7)   $\mathrm{MSE}(T_2) = \sigma^2 \left[ \dfrac{(\sum_{\bar{s}} x_i)^2}{n} + \sum_{\bar{s}} x_i^2 \right]$ ,

where $\sigma^2 x_i^2$ is the model variance of $Y_i$.
If the sampling fraction is small and no single $x_i$ dominates the
others, the MSE is affected very little by the choice of sample and
little efficiency is lost by choosing an overbalanced sample.
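
The reduction of the general predictor (3.2) to the two special cases (3.4) and (3.5) can be checked numerically. The sketch below is illustrative only: the function names and the data are hypothetical, and the variance function V is passed as an ordinary Python callable, which is a convenience of this sketch rather than anything prescribed by Scott et al. (1978).

```python
import math

def blu_predictor(y_s, x_s, x_rest, V):
    """T_0 of (3.2): sample total plus a weighted-slope prediction of the rest."""
    num = sum(y * x / V(x) for y, x in zip(y_s, x_s))
    den = sum(x * x / V(x) for x in x_s)
    return sum(y_s) + (num / den) * sum(x_rest)

def ratio_estimator(y_s, x_s, x_rest):
    """T_1 of (3.4): (sum_s y / sum_s x) times the population x-total."""
    return sum(y_s) / sum(x_s) * (sum(x_s) + sum(x_rest))

def mean_of_ratios_predictor(y_s, x_s, x_rest):
    """T_2 of (3.5): sample total plus mean of y/x times the non-sample x-total."""
    n = len(y_s)
    return sum(y_s) + sum(y / x for y, x in zip(y_s, x_s)) / n * sum(x_rest)

# Hypothetical sample (y_s, x_s) and non-sample x-values, for illustration only.
y_s, x_s, x_rest = [3.0, 5.0, 9.0], [1.0, 2.0, 4.0], [3.0, 5.0]

t_with_v_x = blu_predictor(y_s, x_s, x_rest, V=lambda x: x)        # V(x) = x
t_with_v_x2 = blu_predictor(y_s, x_s, x_rest, V=lambda x: x * x)   # V(x) = x^2
```

With V(x) = x the weights cancel to give the ratio of totals, and with V(x) = x^2 they give the mean of the sample ratios y/x, which is what the two comparisons in the test confirm.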

In many practical situations, V(x) increases more quickly
than x but less quickly than $x^2$, so that $\xi(\delta_0, \delta_1, \ldots, \delta_J : V(x))$
with $V(x) = a_1 x + a_2 x^2$ is often a fairly realistic model. Both $T_1$
with balanced sample and $T_2$ with overbalanced sample are BLU for their
respective samples under this model and it is interesting to compare
their performances. Scott et al. (1978) show that the MSE of $T_1$ with
balanced sample is

(3.8)   $M_1 = N(N-n)\,(a_1 \bar{x} + a_2 \bar{x}^{(2)})/n$ ,

where $\bar{x}^{(2)} = \frac{1}{N} \sum_{i=1}^{N} x_i^2$, while the MSE for $T_2$
with overbalanced sample is

(3.9)   $M_2 = N(N-n)\,\bar{x}_{\bar{s}}\,(a_1 + a_2 \bar{x})/n$ ,

where $\bar{x}_{\bar{s}}$ is the mean of the x-values not included in the
overbalanced sample.

It follows from (3.6) with j = 0 that $\bar{x}_{\bar{s}} < \bar{x}_s$, and hence
$\bar{x}_{\bar{s}} < \bar{x}$, so that $M_1 > M_2$. Thus the ratio estimator $T_1$ with
balanced sample will be less efficient than $T_2$ with overbalanced
sample. The loss of efficiency will be small in general if $a_1$
dominates $a_2$, but can be substantial if $a_2$ is relatively large.
These results apply to any polynomial model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : V(x))$ with variance function
$V(x) = a_1 x + a_2 x^2$; hence $T_2$ with overbalanced sample (j = 0,1,...,J)
is more efficient than $T_1$ with balanced sampling of the same order.

How to get overbalanced sample:

If the sample selection is with probability proportional to
$x_i$, then

(3.10)   $E\left( \sum_s x_i^{j-1} / n \right) = \sum_{i=1}^{N} x_i^j \Big/ \sum_{i=1}^{N} x_i$ .

This indicates that selection with probability proportional to $x_i$
yields an approximately overbalanced sample if the sample size is large
and the sampling fraction is small. On the other hand, if the sampling
fraction is large, then Scott et al. (1978) suggest selecting units with
probability equal to

(3.11)   $\pi_i = \dfrac{\lambda x_i}{1 + \lambda x_i}$ ,

where $\lambda$ is the solution to the equation

(3.12)   $\sum_{i=1}^{N} \dfrac{\lambda x_i}{1 + \lambda x_i} = n$ .

An iterative solution of (3.12) is suggested, with starting value
$\lambda_0 = n / \sum_{i=1}^{N} x_i$. Using (3.11), the probability of not sampling the i-th
element is $(1 + \lambda x_i)^{-1}$, so that, for all j,

(3.13)   $E\left( \sum_s x_i^{j-1} \right) = \sum_{i=1}^{N} \dfrac{\lambda x_i^j}{1 + \lambda x_i} = \lambda\, E\left( \sum_{\bar{s}} x_i^j \right)$ ,

so that we would expect to obtain an approximately overbalanced sample
if the sample size is large enough.
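
Equation (3.12) has no closed-form solution, so the constant in the selection probabilities has to be found numerically. The text says only that an iterative solution with the stated starting value is suggested; the particular fixed-point update used below (divide n by the current sum of x_i/(1 + lambda x_i)) is an assumption of this sketch, as are the function name, the tolerance and the x-values.

```python
def inclusion_probabilities(x, n, tol=1e-12, max_iter=10_000):
    """Solve sum_i lam*x_i/(1 + lam*x_i) = n for lam and return (lam, probabilities).

    Assumed scheme (not from the source): fixed-point iteration
    lam <- n / sum_i x_i/(1 + lam*x_i), started at lam_0 = n / sum_i x_i.
    """
    lam = n / sum(x)  # starting value mentioned in the text
    for _ in range(max_iter):
        new_lam = n / sum(xi / (1.0 + lam * xi) for xi in x)
        converged = abs(new_lam - lam) <= tol * new_lam
        lam = new_lam
        if converged:
            break
    probs = [lam * xi / (1.0 + lam * xi) for xi in x]
    return lam, probs

# Hypothetical size measures with a large sampling fraction (n/N = 1/2).
x = [1, 2, 3, 5, 8, 13, 21, 34]
n = 4
lam, probs = inclusion_probabilities(x, n)

# Both sides of (3.13) for j = 2, evaluated at the computed probabilities.
lhs = sum(p * xi for p, xi in zip(probs, x))
rhs = lam * sum((1.0 - p) * xi ** 2 for p, xi in zip(probs, x))
```

The resulting probabilities sum to n and lie strictly between 0 and 1, so they can be used as inclusion probabilities for the individual units.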


Royall and Herson (1973b) and Scott et al. (1978) have extended
their definitions of balanced sampling and overbalanced sampling schemes,
respectively, to stratified populations. Royall (1976b) has given some
different ideas of balanced samples for two-stage sampling. Let the finite
population consist of N elements and K clusters with $M_i$ elements in
the i-th cluster, such that $\sum_{i=1}^{K} M_i = N$. Suppose we first choose
a sample s of k clusters and then, from the i-th sample cluster,
a random sample $s_i$ consisting of $m_i$ elements is selected out of the
$M_i$ elements.

Let

$\bar{M}_s^{(j)} = \sum_{i \in s} M_i^j / k$ ,   $\bar{M}^{(j)} = \sum_{i=1}^{K} M_i^j / K$ ;

then Royall (1976b) says that the above two-stage sample is balanced if

(3.15)   $\bar{M}_s^{(j)} = \bar{M}^{(j)}$ ,   j = 1, 2.

This type of balanced sample gives unbiased ratio estimators in two-stage
sampling under a quadratic regression model, and the ratio-type estimator
is best. This result also holds for higher-order polynomial models
when the sample is balanced on the corresponding higher-order moments.

In the literature on survey sampling we also find a completely
different definition of balanced sample, given by Singh and Garg (1979).
This is actually a kind of systematic sample with a random start. The
suggested balanced sample is as follows. Assuming the population size N
and the sample size n are both even (for odd values of N and n a
modification of the definition is also available), first draw n/2 units
at random from the first N/2 units of the population; the remaining n/2
elements are taken from the (N/2+1)-th to N-th units of the population,
with indices $N+1-r_i$, i = 1,...,n/2, where $r_i$ is the index of the
i-th unit drawn from the first N/2 units.

This sampling plan has the advantage of both simple random 
sampling and systematic sampling and works best for population exhibiting 
linear trend or periodicity. Their empirical study shows that this 
balanced sampling is generally better than simple random sampling and 
in most of the cases even better than systematic sampling and stratified 
sampling. 
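
The selection rule of Singh and Garg (1979) for even N and n is simple to state in code. The sketch below is a minimal illustration under that even-even assumption; the function name is hypothetical and the modification for odd N or n mentioned in the text is not covered.

```python
import random

def balanced_systematic_sample(N, n, rng=random):
    """Draw n/2 labels at random from 1..N/2 and add their mirror images N+1-r.

    Covers only the case where N and n are both even.
    """
    if N % 2 or n % 2:
        raise ValueError("this sketch covers only even N and n")
    first_half = rng.sample(range(1, N // 2 + 1), n // 2)
    mirrored = [N + 1 - r for r in first_half]
    return sorted(first_half + mirrored)

# Example with a seeded generator so that the draw is reproducible.
labels = balanced_systematic_sample(20, 6, random.Random(0))
```

Because each selected label r is paired with N+1-r, the mean of the sampled labels always equals (N+1)/2, so for a population with an exact linear trend in the label the sample mean reproduces the population mean.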

It is clear from the above discussion of randomization and
purposive sampling that there are some cases where the randomization
principle does not carry much meaning but purposive selection, such as
optimum sampling or balanced sampling, gives meaningful and more
precise estimators. It may sometimes be feasible to draw these types
of purposive samples, but in large-scale surveys with many items the
purposive design could lead to very inefficient estimators for some
of the items. Rao (1975) says, "Of course, this criticism also
applies to conventional designs such as the probability proportional
to size sampling plans or stratification by size with a 100% sampling
rate in the stratum containing the units with largest $x_i$. In such
a situation, it might be advisable to employ equal probability
sampling and utilize any quantitative concomitant information only at
the estimation stage." The role of randomization in survey sampling
cannot be taken as the only basis of data analysis and inference, as
conventional survey samplers used to think.


CHAPTER IV


RANKS AND ORDER STATISTICS FOR FINITE POPULATION 


§4.1 ORDER STATISTICS IN SAMPLING FROM FINITE POPULATION


There are few works in the survey sampling literature on the use
of ranks and order statistics in estimating finite population parameters.
It seems that Wilks (1962, p. 243) was the first to discuss the distribution
of order statistics in samples from a finite population. There he
considered a finite population $\pi_N$ consisting of N distinct elements,
say $y_{0(1)} < y_{0(2)} < \cdots < y_{0(N)}$, and derived the probability function of the
sample k-th order statistic. Let s be a random sample of size n
from this population and let us denote the ordered sample by
$Y_{(1)} < Y_{(2)} < \cdots < Y_{(n)}$. It follows that the probability of the k-th
order statistic of the sample being equal to the t-th order statistic
of the population is

(4.1)   $p_{N,n,k}(t) = P[Y_{(k)} = y_{0(t)}] = \binom{t-1}{k-1} \binom{N-t}{n-k} \Big/ \binom{N}{n}$ ,

where t = k, k+1, ..., N-n+k. We can regard (4.1) either as (i) the
probability function of $Y_{(k)}$, with mass points $y_{0(t)}$,
t = k, k+1, ..., N-n+k, or (ii) the probability function of the random
variable t, that is, the rank of the y-value in the population to
which the k-th order statistic in the sample is equal.

Wilks (1962) has given the following results on the moments of
t. Moments of t are easier to derive from the following relation:

(4.2)   $E[(t+r-1)^{[r]}] = \dfrac{(k+r-1)^{[r]}\,(N+r)^{[r]}}{(n+r)^{[r]}}$ ,

where $x^{[r]} = x(x-1)\cdots(x-r+1)$ and r is fixed. Putting r = 1
and 2 in (4.2), we get after some simplification,

(4.3)   $E(t) = \dfrac{k(N+1)}{n+1}$ ,

(4.4)   $V(t) = \dfrac{k(N+1)(N-n)(n-k+1)}{(n+1)^2 (n+2)}$ .

On the other hand, considering (4.1) as the probability function of $Y_{(k)}$,
we get

(4.5)   $E(Y_{(k)}) = \sum_{t=k}^{N-n+k} y_{0(t)} \binom{t-1}{k-1} \binom{N-t}{n-k} \Big/ \binom{N}{n}$ .

Obviously this has no simple form, and neither does the variance of $Y_{(k)}$.
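
The probability function (4.1) and the moment formulas (4.3) and (4.4) are easy to verify exactly for given N, n and k. The following sketch uses exact rational arithmetic; the function names are hypothetical and the values N = 25, n = 9, k = 5 are borrowed from the example later in this section.

```python
from fractions import Fraction
from math import comb

def rank_pmf(N, n, k):
    """p_{N,n,k}(t) of (4.1) for t = k, ..., N-n+k, as exact fractions."""
    total = comb(N, n)
    return {t: Fraction(comb(t - 1, k - 1) * comb(N - t, n - k), total)
            for t in range(k, N - n + k + 1)}

def mean_and_variance(pmf):
    """Mean and variance of the rank t computed directly from its pmf."""
    mean = sum(t * p for t, p in pmf.items())
    var = sum((t - mean) ** 2 * p for t, p in pmf.items())
    return mean, var

N, n, k = 25, 9, 5
pmf = rank_pmf(N, n, k)
mean, var = mean_and_variance(pmf)

# Closed forms (4.3) and (4.4) for comparison.
mean_formula = Fraction(k * (N + 1), n + 1)
var_formula = Fraction(k * (N + 1) * (N - n) * (n - k + 1), (n + 1) ** 2 * (n + 2))
```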
We have mentioned earlier that (4.1) can be considered to be
the probability that the k-th order statistic of the sample will be the
t-th order statistic of the population. So we may want to know the
most likely value of t for given k. This is given by the value of
t satisfying the following relation:

(4.6)   $p_{N,n,k}(t-1) \le p_{N,n,k}(t) \ge p_{N,n,k}(t+1)$ .

This implies, after some algebraic manipulations,

(4.7)   $\dfrac{(k-1)N}{n-1} \le t \le \dfrac{(k-1)N}{n-1} + 1$ .

As a particular case of (4.7), we can find that the sample median is also
the ML-estimator for the population median of a finite population. This
can be illustrated by the following example.

Example. Let our population size be N = 25 and sample size n = 9,
so that the 5th largest sample value is the sample median; hence
k = 5. So using k = 5 in relation (4.7), we have

$\dfrac{4 \times 25}{8} \le t \le \dfrac{4 \times 25}{8} + 1$

or

$12.5 \le t \le 13.5$ ,

which shows that the most likely integer value of t is 13; but
$y_{0(t)} = y_{0(13)}$ is the median of our population. Hence the sample median
$Y_{(5)}$ is the ML-estimator of the population median $y_{0(13)}$.
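
The most likely rank in the example can also be found by direct search over the support of (4.1), without using the inequality for t. This is a small hypothetical helper written for illustration; it returns the first maximizer if the maximum is attained twice.

```python
from math import comb

def most_likely_rank(N, n, k):
    """Value of t in k..N-n+k that maximizes the numerator of (4.1)."""
    weights = {t: comb(t - 1, k - 1) * comb(N - t, n - k)
               for t in range(k, N - n + k + 1)}
    return max(weights, key=weights.get)

t_mode = most_likely_rank(25, 9, 5)
```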
Median unbiasedness: We have already mentioned that the


sample median is the ML-estimator for the finite population median.
Now we are going to show that the sample median is median unbiased when
sampling is done from a finite population.

Definition. If $\lambda$ is the median of a distribution and $\hat\lambda$ is an
estimator of $\lambda$, then we say $\hat\lambda$ is median unbiased if

(4.9)   $P[\hat\lambda < \lambda] = P[\hat\lambda > \lambda]$ .

For a continuous population $P[\hat\lambda = \lambda] = 0$ and hence the above
probabilities are equal to 1/2. But in general, for a finite population,
$P[\hat\lambda = \lambda] \ne 0$, so for median unbiasedness in a finite population we
shall consider the relation (4.9). To show median unbiasedness of the
sample median, we have to consider the following four cases.

(1) N and n are both odd
(2) N is odd and n is even
(3) N is even and n is odd
(4) N and n are both even.


Case 1. N and n are both odd.

Let N = 2M+1 and n = 2m+1, where M and m are integers.
Therefore $y_{0(M+1)}$ and $Y_{(m+1)}$ are respectively the population and
sample medians. Since (4.1) is a probability function, we have
$\sum_{t=k}^{N-n+k} p_{N,n,k}(t) = 1$. So for k = m+1, we have N-n+k = N-m and

(4.10)   $\sum_{t=m+1}^{N-m} P[Y_{(m+1)} = y_{0(t)}] = \binom{N}{n}^{-1} \sum_{t=m+1}^{N-m} \binom{t-1}{m} \binom{N-t}{m} = 1$ .

The summand in (4.10) is unchanged when t is replaced by N+1-t, so it is
clear from (4.10) that

$\sum_{t=m+1}^{M} P[Y_{(m+1)} = y_{0(t)}] = \sum_{t=M+2}^{N-m} P[Y_{(m+1)} = y_{0(t)}]$

or

$P[Y_{(m+1)} < y_{0(M+1)}] = P[Y_{(m+1)} > y_{0(M+1)}]$ .

Hence the sample median $Y_{(m+1)}$ is median unbiased. But the situation
is a little complicated in other cases, where there is no unique
median.


Case 2. N is odd and n is even.

As before, let N = 2M+1 and n = 2m, where M and m are integers,
so that, conventionally, the sample median is the average of $Y_{(m)}$ and
$Y_{(m+1)}$. For k = m, we have N-n+k = N-m and n-k = m, so that

(4.11)   $\sum_{t=m}^{N-m} P[Y_{(m)} = y_{0(t)}] = \binom{N}{n}^{-1} \sum_{t=m}^{N-m} \binom{t-1}{m-1} \binom{N-t}{m} = 1$ ,

and also, for k = m+1, N-n+k = N-m+1 and n-k = m-1, so that

(4.12)   $\sum_{t=m+1}^{N-m+1} P[Y_{(m+1)} = y_{0(t)}] = \binom{N}{n}^{-1} \sum_{t=m+1}^{N-m+1} \binom{t-1}{m} \binom{N-t}{m-1} = 1$ .

Expanding (4.11) and (4.12) as in Case 1, we find that

(4.13)   $P[Y_{(m)} > y_{0(M+1)}] = P[Y_{(m+1)} < y_{0(M+1)}]$ ,
         $P[Y_{(m)} < y_{0(M+1)}] = P[Y_{(m+1)} > y_{0(M+1)}]$ ,
         $P[Y_{(m)} = y_{0(M+1)}] = P[Y_{(m+1)} = y_{0(M+1)}]$ ,

and

(4.14)   $P[Y_{(m)} \ge y_{0(M+1)}] = P[Y_{(m+1)} \le y_{0(M+1)}]$ .

If we denote $\tilde{Y} = (Y_{(m)} + Y_{(m+1)})/2$ as the median of the sample,
then it follows from (4.14) that $P[\tilde{Y} < y_{0(M+1)}] = P[\tilde{Y} > y_{0(M+1)}]$,
and hence the median unbiasedness of $\tilde{Y}$.


Example. Let N = 15 and n = 6, so that M = 7 and m = 3. The
sample median is $\tilde{Y} = (Y_{(3)} + Y_{(4)})/2$, the population median is
$y_{0(8)}$, and

$\binom{N}{n} = \binom{15}{6} = 5005$ .

Using (4.11),

$P[Y_{(3)} < y_{0(8)}] = \dfrac{3115}{5005}$ ,   $P[Y_{(3)} = y_{0(8)}] = \dfrac{735}{5005}$ ,   $P[Y_{(3)} > y_{0(8)}] = \dfrac{1155}{5005}$ .

Again using (4.12),

$P[Y_{(4)} < y_{0(8)}] = \dfrac{1155}{5005}$ ,   $P[Y_{(4)} = y_{0(8)}] = \dfrac{735}{5005}$ ,   $P[Y_{(4)} > y_{0(8)}] = \dfrac{3115}{5005}$ .

Therefore $P[Y_{(3)} \ge y_{0(8)}] = P[Y_{(4)} \le y_{0(8)}] = \dfrac{1890}{5005}$, and hence
$P[\tilde{Y} < y_{0(8)}] = P[\tilde{Y} > y_{0(8)}]$.
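
The probabilities in this example follow directly from (4.1) and can be recomputed exactly. The sketch below is an illustration with a hypothetical function name; it splits the total probability of the k-th sample order statistic into the parts below, at and above a given population rank.

```python
from fractions import Fraction
from math import comb

def order_stat_probs(N, n, k, t0):
    """Return P[Y_(k) < y_0(t0)], P[Y_(k) = y_0(t0)], P[Y_(k) > y_0(t0)]."""
    total = comb(N, n)
    below = equal = above = 0
    for t in range(k, N - n + k + 1):
        weight = comb(t - 1, k - 1) * comb(N - t, n - k)
        if t < t0:
            below += weight
        elif t == t0:
            equal += weight
        else:
            above += weight
    return tuple(Fraction(v, total) for v in (below, equal, above))

# N = 15, n = 6: the two middle sample order statistics against y_0(8).
probs_k3 = order_stat_probs(15, 6, 3, 8)
probs_k4 = order_stat_probs(15, 6, 4, 8)
```

The two triples are mirror images of each other, which is the symmetry used in Case 2.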
Case 3. N is even and n is odd.

Let N = 2M and n = 2m+1, where M and m are integers. Let
the population median be $\tilde{y}_0 = (y_{0(M)} + y_{0(M+1)})/2$; the sample median
is $Y_{(m+1)}$. For k = m+1, N-n+k = N-m. Using these values and
proceeding as in Case 1, we find that

(4.15)   $P[Y_{(m+1)} \le y_{0(M)}] = P[Y_{(m+1)} \ge y_{0(M+1)}]$

and hence

(4.16)   $P[Y_{(m+1)} < \tilde{y}_0] = P[Y_{(m+1)} > \tilde{y}_0]$

establishes that $Y_{(m+1)}$ is median unbiased.

Case 4. N and n are both even.

Let N = 2M and n = 2m, where M and m are integers. The sample
and population medians are, respectively, $\tilde{Y} = (Y_{(m)} + Y_{(m+1)})/2$ and
$\tilde{y}_0 = (y_{0(M)} + y_{0(M+1)})/2$. Proceeding as in Case 2, we find that

(4.17)   $P[Y_{(m)} = y_{0(M)}] = P[Y_{(m+1)} = y_{0(M+1)}]$ ,
         $P[Y_{(m)} < y_{0(M)}] = P[Y_{(m+1)} > y_{0(M+1)}]$ ,

(4.18)   $P[Y_{(m+1)} \le y_{0(M)}] = P[Y_{(m)} \ge y_{0(M+1)}]$ ,

and

(4.19)   $P[Y_{(m)} \le y_{0(M)}] = P[Y_{(m+1)} \ge y_{0(M+1)}]$ .

Now,

(4.20)   $P[\tilde{Y} < \tilde{y}_0] = P\left[ \dfrac{Y_{(m)} + Y_{(m+1)}}{2} < \dfrac{y_{0(M)} + y_{0(M+1)}}{2} \right] = P[Y_{(m+1)} \le y_{0(M)}]$ .

Similarly,

(4.21)   $P[\tilde{Y} > \tilde{y}_0] = P[Y_{(m)} \ge y_{0(M+1)}]$ .

Therefore, using (4.18), (4.20) and (4.21), we get
$P[\tilde{Y} < \tilde{y}_0] = P[\tilde{Y} > \tilde{y}_0]$,
which establishes the median unbiasedness of $\tilde{Y}$.

§4.2 CONFIDENCE INTERVALS FOR QUANTILE IN FINITE POPULATION


Let t be a fixed integer in the range $1 \le t \le N$. Then we
can consider $y_{0(t)}$ as the (t/N)-th quantile of the population $\pi_N$.
If $G_N(y) = \#\{ i : y_{0i} \le y,\ 1 \le i \le N \}/N$, then we formally define the
$\alpha$-th quantile of the finite population as $\sup\{ y : G_N(y) \le \alpha \}$, $0 \le \alpha \le 1$.
The sample quantile is defined similarly.

A confidence interval for $y_{0(t)}$ in a finite population is
available in Wilks (1962, p. 333). Years later Meyer (1972) and
Sedransk and Meyer (1978) extensively studied and extended the results
on this confidence interval. For fixed t = t',

(4.22)   $P[Y_{(k)} \le y_{0(t')}] = \sum_{t=k}^{t'} p_{N,n,k}(t)$ .

So for fixed N, n, t' and $\gamma > 0$, there is a largest k, say k',
such that

(4.23)   $\sum_{t=k'}^{t'} p_{N,n,k'}(t) \ge \gamma$ .

We shall consider $Y_{(k')}$ as the best lower 100$\gamma$% confidence limit
for $y_{0(t')}$. Except for values of N, n, t' and $1-\gamma$ which are
uninterestingly small, such lower confidence limits can be shown to
exist. Similarly, the best upper 100$\gamma$% confidence limit for $y_{0(t')}$
is obtained by choosing the smallest k, say k'', such that

(4.24)   $\sum_{t=t'}^{N-n+k''} p_{N,n,k''}(t) \ge \gamma$ .
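
The search for the index k' of the best lower confidence limit can be written out directly from (4.1). The following sketch is illustrative only; the function names are hypothetical, and the coverage is computed as the sum of the rank probabilities from k up to t', which is the probability that the k-th sample order statistic does not exceed the t'-th population value.

```python
from fractions import Fraction
from math import comb

def coverage_lower(N, n, k, t0):
    """P[Y_(k) <= y_0(t0)]: sum of p_{N,n,k}(t) over t = k, ..., t0."""
    upper = min(t0, N - n + k)
    hits = sum(comb(t - 1, k - 1) * comb(N - t, n - k)
               for t in range(k, upper + 1))
    return Fraction(hits, comb(N, n))

def best_lower_limit_index(N, n, t0, gamma):
    """Largest k whose coverage is at least gamma, or None if no k qualifies."""
    best = None
    for k in range(1, n + 1):
        if coverage_lower(N, n, k, t0) >= gamma:
            best = k
    return best

# N = 15, n = 6, population median y_0(8), confidence level 0.75.
k_prime = best_lower_limit_index(15, 6, 8, Fraction(3, 4))
```

Since the coverage decreases as k increases, the last qualifying k in the loop is the largest one, so `k_prime` indexes the sample order statistic used as the lower limit.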


For the best 100$\gamma$% confidence interval for $y_{0(t')}$, that is, the
simultaneous upper and lower confidence limits, the probability
function involved is cumbersome. Meyer (1972) has given the
following expression for the simultaneous confidence interval
$[Y_{(k)}, Y_{(r)}]$ for $y_{0(t)}$, where


(4.25) PLY 4) areas at 


ar t-i-1\ ,N-tti ter-l 7 t=i-2\ ,N-ttitl 

" i=o (ea aK tet) : bo Cu x n=r ) 
Pe Nin) DEC ANARAEGN aie Lace. 

& 

A simpler form of this expression is available in Sedransk and Meyer
(1978). This paper also states that in forming a confidence interval for
the $\alpha$-th quantile, the confidence coefficient for a population with ties
is larger than the confidence coefficient for a population without ties;
a proof is available in Meyer (1972). In fact the confidence coefficient
for the $\alpha$-th quantile for a population without ties is the lower
bound for the confidence coefficient of the comparable confidence
interval for any finite population.


Confidence intervals in case of stratified sampling

Let us now consider a stratified population of q strata
having stratum sizes $N_i$, with $\sum_{i=1}^{q} N_i = N$, the population size.
Let the population values in ascending order be:

(4.26)   $y_{01(1)} < \cdots < y_{01(N_1)} < y_{02(1)} < \cdots < y_{02(N_2)} < \cdots < y_{0q(1)} < \cdots < y_{0q(N_q)}$ .

We have drawn a stratified random sample of size n from this
population with $n = \sum_{i=1}^{q} n_i$, where $n_i$ is the number of units selected
at random from the i-th stratum. Let the ordered sample values from the
i-th stratum be $Y_{i(1)} < \cdots < Y_{i(n_i)}$, i = 1,...,q.

Let our interest be in a confidence interval for the (t/N)-th quantile of
the population. It is sufficient to look at the stratum which contains
the t-th order y-value of the population. Let the t-th order value be in
the m-th stratum. Since a sample from each stratum is drawn independently,
we shall consider, for the (t/N)-th quantile, the sample drawn from
the m-th stratum only. Let $t'' = t - (N_1 + \cdots + N_{m-1})$. Proceeding in a
similar way as in section one of this chapter, we can find the probability
that the k-th order statistic of the sample from the m-th stratum will be
equal to the t-th order statistic of the population, i.e.,

(4.27)   $P[Y_{m(k)} = y_{0(t)} = y_{0m(t'')}] = \binom{t''-1}{k-1} \binom{N_m - t''}{n_m - k} \Big/ \binom{N_m}{n_m}$ ,
         $t'' = k, k+1, \ldots, N_m - n_m + k$ .

If the t-th order statistic of the population is not in the m-th
stratum, but is in the $\ell$-th stratum, then

(4.28)   $P[Y_{m(k)} = y_{0(t)} = y_{0\ell(t'')}] = 0$ for $m \ne \ell$, and
         $t'' = t - (N_1 + \cdots + N_{\ell-1})$ .

It appears from (4.27) and (4.28) that results for an unstratified
population can easily be used for a stratified population with a few
changes in notation. Sedransk and Meyer (1978) have given results
for a more general case. There they have not imposed
the restriction (4.26) on the population, and established results
for a population with two strata.
§4.3 JOINT DISTRIBUTION OF QUANTILES OF A SAMPLE FROM BIVARIATE
FINITE POPULATION 


Let our bivariate finite population be $(x_{01}, y_{01}), (x_{02}, y_{02}), \ldots, (x_{0N}, y_{0N})$,
of size N. For simplicity let us assume that there
are no ties among the x's or among the y's. Let the ordered x-values
be $x_{0(1)} < x_{0(2)} < \cdots < x_{0(N)}$ and the ordered y-values be
$y_{0(1)} < y_{0(2)} < \cdots < y_{0(N)}$; here $y_{0(i)}$ does not necessarily belong to the
same pair as $x_{0(i)}$. We have drawn a simple random sample of
size n from this population. Let the sample be $(X_i, Y_i)$,
i = 1,2,...,n. As above, let us denote the ordered sample X-values and
Y-values by $X_{(1)} < \cdots < X_{(n)}$ and $Y_{(1)} < \cdots < Y_{(n)}$ respectively.

Our objective is to find the bivariate distribution of the sample
(i/n)-th quantile of x and the (j/n)-th quantile of y. Let $X_{(i)}$ and
$Y_{(j)}$ be the corresponding sample quantiles. Siddiqui (1960) has derived
the joint distribution of $(X_{(i)}, Y_{(j)})$ when the sample is drawn from
a continuous bivariate distribution.


The distribution of $(X_{(i)}, Y_{(j)})$ for a sample from a finite
population depends upon the nature of the pairs of values $(x_{0i}, y_{0i})$ in
the population. Analogous to Siddiqui (1960), we shall introduce two
new variables M and $m_0$, where

M = number of pairs in the sample with X-value less than $X_{(i)}$
    and Y-value less than $Y_{(j)}$; M is a random variable which may vary
    from sample to sample.

$m_0$ = number of pairs in the population with x-value less than $x_{0(s)}$
    and y-value less than $y_{0(t)}$; $m_0$ is non-random if one considers the
    finite population as fixed, and $m_0$ varies for different
    values of s and t.

Here, M is a dummy variable which is to be summed out from the joint
distribution of $(M, X_{(i)}, Y_{(j)})$ to get our desired joint distribution
of $(X_{(i)}, Y_{(j)})$.
Let $(x_0, y_{0(t)})$ and $(x_{0(s)}, y_0)$ be the two units of the
population which lie on the lines $y = y_{0(t)}$ and $x = x_{0(s)}$
respectively. Then any one of the following five cases may occur.

Case 1. $x_0 < x_{0(s)}$, $y_0 < y_{0(t)}$
Case 2. $x_0 < x_{0(s)}$, $y_0 > y_{0(t)}$
Case 3. $x_0 > x_{0(s)}$, $y_0 < y_{0(t)}$
Case 4. $x_0 > x_{0(s)}$, $y_0 > y_{0(t)}$
Case 5. $x_0 = x_{0(s)}$, $y_0 = y_{0(t)}$, in which case there is only one point
        $(x_0, y_0)$ common to both lines $x = x_{0(s)}$ and
        $y = y_{0(t)}$. In such a case, the pair $(x_0, y_0) =
        (x_{0(s)}, y_{0(t)})$ is the measurement on a single unit of the
        population.

Let us now find $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}]$ under
the above mentioned different cases.


Case 1. If our population satisfies Case 1, then a possible
configuration of population and sample values is given in Figure 1.
For simplicity of the figures, we shall assume that the x-values are all
positive.

[FIGURE 1: the lines $x = x_{0(s)}$ and $y = y_{0(t)}$ divide the plane into
four quadrants; the upper-right quadrant contains $N+m_0-s-t+2$ population
pairs and $n+M-i-j+2$ sample pairs.]

In the above figure, $N+m_0-s-t+2$ (or $n+M-i-j+2$) represents the
number of pairs $(x_{0i}, y_{0i})$ (or $(X_i, Y_i)$) in the population (or sample)
satisfying $x_{0i} > x_{0(s)}$ and $y_{0i} > y_{0(t)}$ (or $X_i > X_{(i)}$ and
$Y_i > Y_{(j)}$). Similar meanings apply for the other numbers in the figure.
Points marked by ⊙ correspond to the units of the population which lead
us to consider Case 1.

Since our sample is a simple random sample of size n,
drawn without replacement,

(4.29)   $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \binom{m_0}{m} \binom{s-m_0-2}{i-m-2} \binom{t-m_0-2}{j-m-2} \binom{N+m_0-s-t+2}{n+m-i-j+2} \Big/ \binom{N}{n}$ ,

where
    m = 0,1,...,m' = min($m_0$, i-2, j-2),
    s = i, i+1, ..., N-n+i,
    t = j, j+1, ..., N-n+j.

So (4.29) can be considered as the joint probability function of M,
$X_{(i)}$ and $Y_{(j)}$, with mass points at M = 0,1,...,m', $X_{(i)} = x_{0(s)}$
and $Y_{(j)} = y_{0(t)}$,
and for the rest of the thesis we shall assume that, for any integers
p and q, $\binom{p}{q} = 0$ whenever q < 0 or q > p.

Case 2. Proceeding as in Case 1, the configuration of the sample as
well as the population values that satisfy Case 2 is given in Figure 2.

[FIGURE 2: as Figure 1, with $N+m_0-s-t+1$ population pairs and
$n+M-i-j+1$ sample pairs in the upper-right quadrant.]

Hence the required probability is

(4.30)   $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \binom{m_0}{m} \binom{s-m_0-2}{i-m-2} \binom{t-m_0-1}{j-m-1} \binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$ ,

where
    m = 0,1,...,m' = min($m_0$, i-2, j-1),
    s = i, i+1, ..., N-n+i,
    t = j, j+1, ..., N-n+j.

Case 3. The appropriate configuration for the sample as well as for
the population is given in Figure 3, below.

The required probability for such a configuration is

(4.31)   $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \binom{m_0}{m} \binom{s-m_0-1}{i-m-1} \binom{t-m_0-2}{j-m-2} \binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$ ,

where
    m = 0,1,...,m' = min($m_0$, i-1, j-2),
    s = i, i+1, ..., N-n+i,
    t = j, j+1, ..., N-n+j.

Case 4. The configuration for the sample and population corresponding
to Case 4 is given in Figure 4, below.

The probability for such a configuration is

(4.32)   $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \binom{m_0}{m} \binom{s-m_0-1}{i-m-1} \binom{t-m_0-1}{j-m-1} \binom{N+m_0-s-t}{n+m-i-j} \Big/ \binom{N}{n}$ ,

where
    m = 0,1,...,m' = min($m_0$, i-1, j-1),
    s = i, i+1, ..., N-n+i,
    t = j, j+1, ..., N-n+j.

Case 5. A suitable configuration for Case 5 is given in Figure 5,
below.

The required probability is

(4.33)   $P[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \binom{m_0}{m} \binom{s-m_0-1}{i-m-1} \binom{t-m_0-1}{j-m-1} \binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$ ,

where m = 0,1,...,m' = min($m_0$, i-1, j-1). Summing out M, the joint
probability function of $X_{(i)}$ and $Y_{(j)}$ is

(4.34)   $P[X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}] = \sum_{m=0}^{m'} P_k[M = m, X_{(i)} = x_{0(s)}, Y_{(j)} = y_{0(t)}]$ ,

where $P_k$ denotes the probability under Case k,
k = 1,2,...,5, depending on which case we have at hand, and
s = i, i+1, ..., N-n+i; t = j, j+1, ..., N-n+j.
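
The quadrant-counting argument behind Cases 1 to 5 can be checked against brute-force enumeration of all samples from a small bivariate population. The sketch below is an independent check written for illustration, not the derivation of the thesis: the function names, the single formula that covers all five cases through two indicator variables, and the toy population of ranks are assumptions made here.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def choose(p, q):
    """Binomial coefficient with the convention C(p, q) = 0 outside 0 <= q <= p."""
    return comb(p, q) if 0 <= q <= p else 0

def joint_pmf_formula(xr, yr, n, i, j, s, t):
    """P[X_(i) has population x-rank s and Y_(j) has population y-rank t].

    xr[u] and yr[u] are the 1-based x-rank and y-rank of population unit u.
    """
    N = len(xr)
    a = yr.index(t)                    # unit lying on the line y = y_0(t)
    b = xr.index(s)                    # unit lying on the line x = x_0(s)
    on_lines = 1 if a == b else 2      # Case 5 has a single common unit
    a_left = 1 if xr[a] < s else 0     # unit a is among the s-1 smaller x-values
    b_below = 1 if yr[b] < t else 0    # unit b is among the t-1 smaller y-values
    m0 = sum(1 for u in range(N) if xr[u] < s and yr[u] < t)
    upper_left = s - 1 - m0 - a_left
    lower_right = t - 1 - m0 - b_below
    upper_right = N - on_lines - m0 - upper_left - lower_right
    ways = 0
    for m in range(m0 + 1):            # sum out the sample count M
        ul = i - 1 - m - a_left
        lr = j - 1 - m - b_below
        ur = n - on_lines - m - ul - lr
        ways += (choose(m0, m) * choose(upper_left, ul)
                 * choose(lower_right, lr) * choose(upper_right, ur))
    return Fraction(ways, comb(N, n))

def joint_pmf_bruteforce(xr, yr, n, i, j):
    """Same distribution obtained by enumerating every sample of size n."""
    N = len(xr)
    counts = {}
    for sample in combinations(range(N), n):
        s = sorted(xr[u] for u in sample)[i - 1]
        t = sorted(yr[u] for u in sample)[j - 1]
        counts[(s, t)] = counts.get((s, t), 0) + 1
    return {key: Fraction(c, comb(N, n)) for key, c in counts.items()}

# Hypothetical population of N = 7 units given by their x-ranks and y-ranks.
xr = [1, 2, 3, 4, 5, 6, 7]
yr = [3, 1, 4, 7, 2, 6, 5]
brute = joint_pmf_bruteforce(xr, yr, n=4, i=2, j=3)
```

The indicators `a_left` and `b_below` take the values (1,1), (1,0), (0,1) and (0,0) in Cases 1 to 4 respectively, and Case 5 is the situation where the two line units coincide.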

Now, if we consider our finite population to be a sample from
a super-population with continuous distribution function F(X,Y), then
$m_0$, $x_{0(s)}$ and $y_{0(t)}$ become random variables. In this
circumstance (4.34) will be treated as the conditional distribution of
$(X_{(i)}, Y_{(j)})$ given $m_0$, $x_{0(s)}$ and $y_{0(t)}$. Since the finite
population, in such a case, is a sample from a continuous distribution,
the marginal distribution of $(m_0, x_{0(s)}, y_{0(t)})$ will be as given in
Siddiqui (1960).


§4.4 PREDICTION OF FINITE POPULATION QUANTILE USING AUXILIARY VARIABLES


In this section we would like to investigate the possibility of
using available information on the auxiliary variable, x, to get a
better estimate of the population quantile of y in finite population
sampling. Keeping the above objective in mind, we derived the bivariate
distribution of the quantiles $(X_{(i)}, Y_{(j)})$ as given by (4.34) of the
preceding section. But we could not give a simpler form to this
distribution. Consequently we were unable to propose or investigate any
reasonable estimator for the population quantile of y using quantiles of x.
However, if we assume certain multivariate models, namely,
Model M or perhaps the more traditional multivariate normal
distribution model, at the back of our realized finite population,
then it appears that we can get a "predictor" of finite population
quantiles using auxiliary information.
Here we use the notation developed in Chapter 2. Let
$Y = (Y_1, \ldots, Y_N)'$ be the N-dimensional random vector giving the
finite population $y = (y_1, \ldots, y_N)'$. Under Model M, let the
$\xi$-expectation and $\xi$-variance of Y be

    $E(Y) = X\beta$  and  $\mathrm{Var}(Y) = V$,

where X is an N×p matrix, $\beta$ is a p×1 vector and V is a known N×N
non-singular positive definite diagonal matrix. Let the sample size be
$\nu(s) = n > p$. Let us partition Y as

    $Y = (Y_s', Y_{\bar{s}}')'$,

where $Y_s$ is the n×1 vector of sampled units and $Y_{\bar{s}}$ is the
(N-n)×1 vector of non-sampled units. Accordingly we can partition X and V
as follows:

    $X = \begin{pmatrix} X_s \\ X_{\bar{s}} \end{pmatrix}$,
    $V = \begin{pmatrix} V_{ss} & V_{s\bar{s}} \\ V_{\bar{s}s} & V_{\bar{s}\bar{s}} \end{pmatrix}$,

where $X_s$ is n×p, $V_{ss}$ is n×n, etc.


The minimum variance unbiased estimate $t_{BLU}$ for the population
total $\sum_{i=1}^{N} y_i$, as given in Theorem 2.8, is

(4.35)    $t_{BLU} = 1_s' y_s + 1_{\bar{s}}' X_{\bar{s}} \hat{\beta}_{BLU}$

where

    $\hat{\beta}_{BLU} = (X_s' V_{ss}^{-1} X_s)^{-1} X_s' V_{ss}^{-1} y_s$

and $1_s$ and $1_{\bar{s}}$ are vectors of the form $(1, \ldots, 1)'$
having dimensions n and N-n respectively.
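
As a worked special case of (4.35), offered only as an illustration:
suppose there is a single auxiliary variable (p = 1), so that X is the
column $(x_1, \ldots, x_N)'$, and assume $V = \mathrm{diag}(x_1, \ldots, x_N)$
with every $x_i > 0$. Neither assumption is made in the text above. Under
them the BLU coefficient and the predictor of the total reduce to the
familiar ratio estimator:

```latex
\hat{\beta}_{BLU}
  = \frac{\sum_{i \in s} x_i \, x_i^{-1} \, y_i}{\sum_{i \in s} x_i \, x_i^{-1} \, x_i}
  = \frac{\sum_{i \in s} y_i}{\sum_{i \in s} x_i},
\qquad
t_{BLU}
  = \sum_{i \in s} y_i + \hat{\beta}_{BLU} \sum_{i \notin s} x_i
  = \frac{\sum_{i \in s} y_i}{\sum_{i \in s} x_i} \sum_{i=1}^{N} x_i .
```

The last equality follows by writing $\sum_{i \in s} y_i$ as
$\hat{\beta}_{BLU} \sum_{i \in s} x_i$ and collecting the two sums of x.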

On the other hand, if we assume our super-population
model is $N(X\beta, V)$, i.e. $Y \sim N(X\beta, V)$, then the following
theorem due to Royall (1976a) gives the maximum likelihood estimator for
the population total.


Theorem 4.1. If Y has a $N(X\beta, V)$ probability distribution in
which the known diagonal covariance matrix satisfies $V 1_N = X\gamma$
for some p-vector $\gamma$, then when $Y_s = y_s$ is observed, the
likelihood function for $t = 1_N' y$ is proportional to the
$N\{t_{BLU}, \mathrm{var}(t_{BLU})\}$ probability density function,
where $t_{BLU}$ is given by (4.35), and

(4.36)    $\mathrm{var}(t_{BLU}) = 1_{\bar{s}}' V_{\bar{s}\bar{s}} 1_{\bar{s}} + 1_{\bar{s}}' X_{\bar{s}} (X_s' V_{ss}^{-1} X_s)^{-1} X_{\bar{s}}' 1_{\bar{s}}$.

The above theorem suggests that $t_{BLU}$ is the best linear
unbiased estimator under the normal super-population model, and that
$\mathrm{var}(t_{BLU})$ is its variance under the same model.

It is interesting to look more closely at (4.35). The term
$1_s' y_s$ is the observed sample total and
$1_{\bar{s}}' X_{\bar{s}} \hat{\beta}_{BLU}$ is a prediction of the
total of the non-sampled units. This suggests using the predicted value
of $Y_{\bar{s}}$, obtained using $y_s$ in $\hat{\beta}_{BLU}$, to get
the predicted population values of the y-characters, viz.,
$\hat{y} = (y_s', (X_{\bar{s}} \hat{\beta}_{BLU})')'$. Once we have the
predicted population $\hat{y}$ at hand, we can easily sort out the
required predicted quantile for the finite population. A predictor of
the $(t/N)$th quantile of the finite population will be the
corresponding quantile of the above mentioned predicted population.
Obviously, the predictor suggested above uses auxiliary information
through X.
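
The construction just described can be sketched in a few lines. This is
a minimal illustration, not a definitive implementation: it assumes a
single auxiliary variable (p = 1, regression through the origin) and a
diagonal V, so that the BLU coefficient is a scalar. The function names
`blu_slope`, `predicted_population` and `predicted_quantile`, and the toy
data, are hypothetical.

```python
def blu_slope(x_s, y_s, v_s):
    """Scalar (p = 1) case of the BLU coefficient: sum(x*y/v) / sum(x*x/v)."""
    num = sum(x * y / v for x, y, v in zip(x_s, y_s, v_s))
    den = sum(x * x / v for x, v in zip(x_s, v_s))
    return num / den


def predicted_population(x, y_s, sample_idx, v):
    """Observed y for sampled units, x_i * beta_hat for non-sampled units."""
    x_s = [x[i] for i in sample_idx]
    v_s = [v[i] for i in sample_idx]
    beta_hat = blu_slope(x_s, y_s, v_s)
    observed = dict(zip(sample_idx, y_s))
    return [observed.get(i, x[i] * beta_hat) for i in range(len(x))]


def predicted_quantile(y_hat, t):
    """t-th smallest value (1-based) of the predicted population."""
    return sorted(y_hat)[t - 1]


# Toy data (assumed): N = 6 units, units 0, 2 and 5 are sampled.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v = list(x)                      # assumed: variance proportional to x
sample_idx = [0, 2, 5]
y_s = [2.0, 6.5, 11.5]

y_hat = predicted_population(x, y_s, sample_idx, v)
t_blu = sum(y_hat)               # sample total plus predicted non-sample total
median_hat = predicted_quantile(y_hat, 3)
```

With these numbers the fitted slope is 2, the predicted population is
(2, 4, 6.5, 8, 10, 11.5), its total is 42, and the predicted (3/6)th
quantile is 6.5.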


t 
As before, let $y_{0(t)}$ be the $(t/N)$th quantile of the finite
population, which we are going to predict by $\hat{Y}_{0(t)}$ as per the
above suggestions. At this stage it is required to investigate properties
such as $\xi$-unbiasedness and $\xi$-MSE of $\hat{Y}_{0(t)}$. Let
$q = t/N$ and let $y_{(q)}$ be the sample $q$th quantile obtained from
$y_s$, without using the auxiliary information. This $y_{(q)}$ is the
commonly used predictor of the corresponding population quantile
$y_{0(t)}$. Now if

    $E_\xi(\hat{Y}_{0(t)} - y_{0(t)})^2 \le E_\xi(y_{(q)} - y_{0(t)})^2$  for any sample $s \in S$,

then our proposed predictor $\hat{Y}_{0(t)}$ is at least as good as the
predictor $y_{(q)}$, and if the strict inequality holds for some s, then
our predictor $\hat{Y}_{0(t)}$ will be better than the predictor
$y_{(q)}$ under Model M or the multivariate normal super-population
model.


To study these properties we need the moments or the distribution
of $\hat{Y}_{0(t)}$, which at present we are unable to find. If we
assume the above mentioned multivariate normal super-population, then
the marginal distribution of $Y_s$ is also multivariate normal. But the
joint distribution of $\hat{Y}' = (Y_s', \hat{Y}_{\bar{s}}')$ is
multivariate normal with a singular variance-covariance matrix of rank
n. Obviously the $\hat{Y}_i$'s are no longer independent and identically
distributed; rather, the distribution depends on the columns of X. The
exact distribution or moments of order statistics for dependent variates
have been studied by Young (1967), Greig (1967) and Afonja (1972). But
there they considered parent distributions that are either exchangeable,
or have equal correlation among the variates, or have a non-singular
variance-covariance matrix. In survey sampling both N and n are usually
very large. So, asymptotic properties (as $N, n \to \infty$) may be of
some interest. Some general results on the asymptotic behaviour of
functions of order statistics with different mixing types of dependence
are available in Gastwirth and Rubin (1975) and Mehra and Rao (1975). But
these types of dependence apparently do not correspond to the nature
of dependence we have in our $\hat{Y}$. So, for developing useful
properties of our estimator $\hat{Y}_{0(t)}$, further study will be
required on the exact and asymptotic distribution of order statistics
and their functions where the sample is drawn from a population with
singular variance-covariance matrix.


CHAPTER V


ASYMPTOTIC RESULTS FOR SAMPLES FROM FINITE POPULATION 


§5.1 INTRODUCTION 


It is a common practice in survey sampling to invoke the central
limit theorem for large populations and, on that basis, to use standard
tests for testing hypotheses concerning finite population parameters.
Rosén (1964) has given a systematic analytic basis for the asymptotic
behaviour of statistics based on sampling from a finite population.
There are also some earlier works in this area, namely, Erdős and Rényi
(1959) and Hájek (1960). Recent works on the asymptotic behaviour of
order statistics and quantiles of a sample from a finite population are
Jha (1975) and Singh (1980). In this chapter we shall state some of
these results and then derive the asymptotic bivariate distribution of
sample quantiles.


§5.2 SOME ASYMPTOTIC RESULTS


Usually we consider our population,
$\pi = (y_{01}, \ldots, y_{0N})$, as a finite set of fixed numbers, and
the sample of size n drawn from this population is $Y_1, \ldots, Y_n$.
When the sampling is done without replacement, the sample observations
become correlated due to sampling. Although this dependency can be
ignored for sufficiently large population size N, for samples drawn
without replacement the conventional limit procedure for independent
observations as $n \to \infty$ does not have any meaning: the population
will be exhausted after a finite number of drawings. So many authors
considered a double sequence of random variables as follows:

    $Y_{11}, Y_{12}, \ldots, Y_{1 n_1}$ is a random sample from $\pi_1$
    ...
    $Y_{k1}, Y_{k2}, \ldots, Y_{k n_k}$ is a random sample from $\pi_k$
    ...

They considered the limiting behaviour of statistics based on the
sequence $\{\pi_k\}$ of populations and assumed that the $k$th population
size $N_k \to \infty$ as $k \to \infty$. Let

(5.1)    $\mu_k$ and $\sigma_k^2$

be respectively the $k$th population mean and variance. Let $F_k$ be the $k$th


population distribution function obtained by giving weight $1/N_k$ to
each element of $\pi_k$. $F_k$ is assumed to be right continuous. The
centered distribution function $F_k^0(y)$ is defined by
$F_k^0(y) = F_k(y + \mu_k)$. Let
$\bar{Y}^{(k)}_{n_k} = (Y_{k1} + \cdots + Y_{k n_k})/n_k$. Then the
following two theorems establish the convergence of the sample mean in
finite population sampling.


Theorem 5.1. (Rosén, 1964). Let $\{\pi_k\}$ be a sequence of populations.
A sufficient condition for $\bar{Y}^{(k)}_{n_k} - \mu_k$ to converge almost surely to 0
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(5.3) lim sup f ydF(y) = 0 


then condition (5.2) is necessary for $\bar{Y}^{(k)}_{n_k} - \mu_k$ to converge in


probability to 0 for sSS {n}- 


ky 
0 
Theorem 5.2. (Rosén, 1964). A necessary and sufficient condition
that $\bar{Y}^{(k)}_{n_k} - \mu_k$ converges to 0 with probability 1 for
every sequence $\{n_k\}$ with $n_k \to \infty$ when $k \to \infty$ is
that $\{\pi_k\}$ satisfies

(5.4)    $\lim_{A \to \infty} \sup_k \int_{|y| > A} |y| \, dF_k^0(y) = 0$


The central limit theorem for finite populations has been studied
by Erdős and Rényi (1959) and Hájek (1960, 1961). The following theorem,
due to Hájek (1960), gives a necessary and sufficient condition for the
sample total to be asymptotically normally distributed. As mentioned
in Chapter 1, let $U_k = \{1, 2, \ldots, N_k\}$ be the label set of the
units of the $k$th population ($\pi_k$). Let $s_k$ be a simple random
sample of size $n_k$ from $U_k$, so that the $k$th sample total
$n_k \bar{Y}^{(k)}_{n_k} = \sum_{i \in s_k} y_{ki}$ has mean and variance
equal to $n_k \mu_k$ and $\{(N_k - n_k)/N_k\} n_k \sigma_k^2 = D_k$,
respectively, where $\mu_k$ and $\sigma_k^2$ are as defined in (5.1).
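
The stated mean and variance of the sample total can be checked by
brute force on a small population. The sketch below enumerates every
simple random sample drawn without replacement. It assumes that the
population variance in (5.1) uses the divisor N - 1, which is the
convention under which the variance formula above is exact. The function
name `srs_total_moments` and the toy population are hypothetical.

```python
from itertools import combinations


def srs_total_moments(pop, n):
    """Exact mean and variance of the sample total over all SRSWOR samples."""
    totals = [sum(c) for c in combinations(pop, n)]
    mean = sum(totals) / len(totals)
    var = sum((t - mean) ** 2 for t in totals) / len(totals)
    return mean, var


pop = [1.0, 4.0, 6.0, 9.0, 10.0]   # toy population (assumed)
N, n = len(pop), 2

mu = sum(pop) / N
sigma2 = sum((y - mu) ** 2 for y in pop) / (N - 1)   # divisor N - 1 (assumed)

mean_total, var_total = srs_total_moments(pop, n)
d_formula = (N - n) / N * n * sigma2
```

For this population the enumeration gives mean 12 and variance 16.2 for
the sample total, matching n times the population mean and the value of
`d_formula`.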


Theorem 5.3. (Hájek, 1960). Let $S_{k\varepsilon}$ be the subset of
elements of $U_k$ on which the inequality

(5.5)    $|y_{ki} - \mu_k| > \varepsilon \sqrt{D_k}$

holds, where $D_k$ is the variance of the $k$th sample total,
$n_k \bar{Y}^{(k)}_{n_k}$. Suppose $n_k \to \infty$ and
$(N_k - n_k) \to \infty$. Then the random variable
$n_k \bar{Y}^{(k)}_{n_k}$ has asymptotically normal distribution with
parameters $(n_k \mu_k, D_k)$ if and only if

(5.6)    $\lim_{k \to \infty} \frac{\sum_{i \in S_{k\varepsilon}} (y_{ki} - \mu_k)^2}{\sum_{i=1}^{N_k} (y_{ki} - \mu_k)^2} = 0$,  for any $\varepsilon > 0$.


Estimation of quantiles is usually considered either under the
assumption of a parametric family of distributions or with hardly any
restriction concerning the distribution. In the first situation an
efficient estimator for the unknown quantile can be derived from the
efficient estimator of the unknown parameter. In the second case,
the natural estimator, namely the sample quantile, cannot be beaten
(Reiss, 1980). We shall now discuss the asymptotic behaviour of sample
quantiles. For this, we need the concept of the empirical distribution
function G(t,n) corresponding to the sample $Y_1, \ldots, Y_n$ from
$\pi$, which is defined as

(5.7)    $G(t,n) = \frac{1}{n} \sum_{j=1}^{n} I(t - Y_j)$,  $-\infty < t < \infty$,

where $I(\cdot)$ is the indicator function with

(5.8)    $I(u) = 0$ if $u < 0$,  $I(u) = 1$ if $u \ge 0$.


If no ambiguity arises then we shall use G(t) for G(t,n).
Let $0 < p < 1$. Then the $p$th quantile of a distribution
function F(t) is defined as the supremum over the t-values for which
$F(t) < p$. Analogously, we define the empirical $p$th quantile
corresponding to a sample of size n from $\pi$ as the supremum over
the t-values for which $G(t,n) < p$. The following theorem, due to Rosén
(1964), gives the asymptotic behaviour of the empirical $p$th quantile.
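
The definitions of G(t,n) and of the empirical $p$th quantile given
above can be sketched as follows. Because G is a right-continuous step
function, the supremum of the t-values with G(t,n) < p is the smallest
sample value y at which G(y,n) >= p, and the code uses that equivalent
form. The function names `edf` and `empirical_quantile` and the sample
values are hypothetical.

```python
def edf(sample, t):
    """G(t, n): fraction of sample values Y_j with Y_j <= t."""
    return sum(1 for y in sample if y <= t) / len(sample)


def empirical_quantile(sample, p):
    """sup{t : G(t, n) < p}, i.e. the smallest sample value y with G(y, n) >= p."""
    ordered = sorted(sample)
    n = len(ordered)
    for k, y in enumerate(ordered, start=1):
        if k / n >= p:
            return y
    raise ValueError("p must satisfy 0 < p <= 1")


sample = [7.0, 1.0, 5.0, 3.0, 9.0]        # toy sample (assumed)
q_half = empirical_quantile(sample, 0.5)   # third smallest value
q_04 = empirical_quantile(sample, 0.4)     # second smallest value
```

Here G jumps from 0.4 to 0.6 at t = 5, so the empirical 0.5 quantile is
5. For p = 0.4, G is below 0.4 for every t < 3 and equals 0.4 at t = 3,
so the supremum is 3.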


Theorem 5.4. Let $Y(p, n_k)$ be the empirical $p$th quantile in a
sample of size $n_k$ from $\pi_k$, k = 1, 2, .... We assume that there is
a continuous distribution function F(t) such that

(5.9)    $\lim_{k \to \infty} \sqrt{n_k} \, \sup_t |F_k(t) - F(t)| = 0$

and, furthermore, that $F'(t)$ is continuous and positive in a vicinity
of the $p$th quantile $\eta_p$ of F(t). Now, if
$\lim_{k \to \infty} n_k = \infty$ and
$\lim_{k \to \infty} n_k/N_k = \lambda < 1$,
then for every real a,

(5.10)    $\lim_{k \to \infty} P\left\{ \frac{F'(\eta_p)\,(Y(p, n_k) - \eta_p)\,\sqrt{n_k}}{\sqrt{p(1-p)(1-\lambda)}} \le a \right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2} \, dx$.
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This work of Rosen was later extended by Singh (1980). Singh 
(1980) has shown that after proper normalization, the weak limit of the 
process g(t) is ee where W° is a Brownian bridge on D[0,1], 


the space of all right continuous functions on [0,1] having left hand
limits (for details on the space D[0,1], see Billingsley (1968)).


§5.3 ASYMPTOTIC BIVARIATE DISTRIBUTION OF SAMPLE QUANTILES


In Chapter 4, we have derived the joint distribution of
$(X_{(i)}, Y_{(j)})$ for a sample from a bivariate distribution of a
finite population. There we considered $X_{(i)}$ and $Y_{(j)}$ as the
sample $(i/n)$th quantile of x-values and $(j/n)$th quantile of y-values
respectively and, depending on the population configuration, we derived
five different forms of probability functions for $(X_{(i)}, Y_{(j)})$.
In this section we shall study the asymptotic behaviour of those
distributions.

We shall first consider the probability function under Case 1
of Chapter 4 (relation (4.29) with s replaced by r). Let us
assume that N is so large that $m_0$, $r - m_0 - 2$, $t - m_0 - 2$ and
$N + m_0 - r - t + 2$ are also sufficiently large for applying the
following approximations:
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Case 1 implies that units (9 (Ly °%q? and (5 >Vocty? are both in the 
sample s, and Xo < x0 (x) » YG < Yocty: Probability that these 


two particular units will be in the sample is, 


1 


(5.16) Va-ebD Prit (Xo (727)? (VQ ¢4y)} e. si 
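
As a small check on the probability that two particular units both fall
in the sample: assuming s is a simple random sample of size n drawn
without replacement from the N units (an assumption of this sketch),
that probability is n(n-1)/{N(N-1)}. The enumeration below confirms the
value for one small case. The function name `inclusion_prob_pair` and
the chosen N and n are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_prob_pair(N, n, i=0, j=1):
    """Exact P(units i and j are both in an SRSWOR sample of size n)."""
    samples = list(combinations(range(N), n))
    hits = sum(1 for s in samples if i in s and j in s)
    return Fraction(hits, len(samples))


N, n = 6, 3
exact = inclusion_prob_pair(N, n)
formula = Fraction(n * (n - 1), N * (N - 1))
```

For N = 6 and n = 3, four of the twenty possible samples contain both
units, so both `exact` and `formula` equal 1/5.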

For large N, (5.16) can be approximated by

Let us denote $\lim_{N \to \infty} r/N = \alpha$,
$\lim_{N \to \infty} t/N = \beta$ and the joint
distribution function of $(X, Y)$ as $N \to \infty$ by F(X,Y), which we
shall assume continuous. So for large N, we can express the right- 
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Similarly, 


<¥<y, + dy} = “Guy! 
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Our (5.21) is exactly the same as in (3.1) of Siddiqui-(1960). Similarly 
we can approximate for Case 2, Case 3, Case 4 and Case 5. Hence our
bivariate distribution of $(X_{(i)}, Y_{(j)})$ for finite population, as
$N \to \infty$, conforms to the bivariate distribution of
$(X_{(i)}, Y_{(j)})$ for continuous population as derived by
Siddiqui (1960).
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